Introduction

Exposure to ambient particulate matter is the seventh highest health risk factor globally and is estimated to have caused 4.1 million premature deaths in 2019 (McDuffie et al. 2021; Murray et al. 2020). Power generation, particularly from the combustion of coal and other fossil fuels, makes a major contribution to ambient fine particulate matter (PM2.5) concentrations and is estimated to be responsible for 10–14% of PM2.5-related premature mortalities globally (Lelieveld et al. 2015; McDuffie et al. 2021; Silva et al. 2016). Global chemical transport modelling studies identify China (Tian et al. 2017), eastern Mongolia and Korea (McDuffie et al. 2021), India (Cropper et al. 2021), eastern Europe (Kushta et al. 2021), the USA (Caiazzo et al. 2013; Thind et al. 2019), eastern Mexico (McDuffie et al. 2021) and the north-eastern parts of South Africa as regions with significant population exposure to PM2.5 from power generation emissions. Both Lelieveld et al. (2015) and McDuffie et al. (2021) identify power generation as the largest contributor of all sources to ambient PM2.5 concentrations (and associated premature mortalities) in the north-eastern region of South Africa.

South Africa is the sixth highest generator of electricity from coal in the world (BP 2022). Efforts have been underway to reduce emissions from coal-fired power stations for more than a decade now, both through replacing coal-fired generation with renewables (Akinbami et al. 2021) and through the enacting of emission standards (Altieri and Keen 2019), but progress has been slow. The Renewable Energy Independent Power Producer Procurement Programme (REIPPP) was initiated in 2011 to give effect to the renewable energy expansion plans envisioned in the Integrated Resource Plan (Department of Mineral Resources and Energy 2011), but in 2022 coal still contributed 80% of South Africa’s grid electricity, relative to only 7% generated from solar and wind (Pierce and le Roux 2023).

Emissions standards were legislated in 2010 in South Africa, requiring coal-fired power stations to comply with emissions limits for SO2 of 500 mg/Nm3 (where a normal cubic metre (Nm3) is dry gas under normal conditions of 273 K, 101.3 kPa calculated to a reference O2 content of 10%), for NOx of 750 mg/Nm3 and for PM of 50 mg/Nm3 by 2020. The cash-strapped national utility has not installed most of the emission abatement retrofits required for compliance and has instead requested successive postponements and then suspension of the emission limits (Euripidou et al. 2022). As a result, emissions of NOx per unit of electricity generated only decreased by 2.9% between 2012 and 2022, and emissions of SO2 increased by 4.3% over the 10-year period (Eskom 2022).

Estimates of the contribution of power generation to ambient PM2.5 concentrations and premature mortality in South Africa vary widely. McDuffie et al. (2021) and Marais et al. (2019) use chemical transport models to calculate ambient PM2.5 concentrations derived from power station emissions, at relatively coarse resolution. McDuffie et al. (2021) estimate that coal combustion for electricity generation contributes 20.5% of ambient PM2.5 over the country, more than any other source. They estimate total attributable premature mortality to range between 25,035 and 34,898 deaths a year. Marais et al. (2019) calculate that there will be 10,400 additional deaths in South Africa in 2030 relative to 2012 due to increases in fossil fuel use for power generation and transport (although it now seems that many of the planned new developments will not materialise because electricity generation from fossil fuels is decreasing while that from renewables is increasing).

In contrast, dispersion model studies using a smaller domain and finer model resolution calculate many fewer premature mortalities arising from exposure to PM2.5 from coal-fired power stations (Langerman and Pauw 2018). Steyn and Kornelius (2018) estimate around 32 premature deaths a year due to exposure to PM2.5 formed from SO2 emissions in excess of 500 mg/Nm3, from 11 coal-fired power stations in Mpumalanga only. Gray (2019) calculates that there were between 268 and 571 premature deaths (mean estimate: 399 premature deaths) due to SO2, NOx and PM emissions from the 12 coal-fired power stations in Mpumalanga and the northern Free State in 2016. Myllyvirta (2014) calculates between 2238 and 2731 premature mortalities a year (range: 729–5117) due to PM2.5 from all power station emissions in excess of the 2020 emission limits.

These dispersion modelling studies have only considered exposure to PM2.5 derived from one source, coal-fired power stations, in their calculations of premature mortalities. This introduces inaccuracies because the exposure-response functions are applied at ambient concentrations below the counterfactual PM2.5 concentrations at which they were derived. The maximum annual average ambient PM2.5 concentrations from coal-fired power stations are modelled to be less than 3.4 μg/m3 (Gray 2019; Myllyvirta 2014), while the counterfactual concentrations are 5.5 μg/m3 for the exposure-response functions derived from the American Cancer Society Study (Pope et al. 2002) and range between 2.4 and 5.9 μg/m3 for Burnett et al.’s (2018) Integrated Exposure Response function.

Furthermore, health impacts are a function of total exposure levels, not just concentration levels from one source. In the past, a linear relationship between exposure and premature deaths was assumed (Pope et al. 2002). However, the linear relationship is only valid at low exposure levels (Pope et al. 2009), and a sub-linear relationship better reflects the “flattening out” of the relationship between exposure and premature mortalities at higher levels of exposure. Put another way, the first microgram of particulate matter inhaled has a much larger impact on the number of premature mortalities than the last microgram inhaled. The Integrated Exposure Response functions, derived considering exposure to PM2.5 in ambient, household and smoker environments (Burnett et al. 2018, 2014), more accurately represent the relationship between premature mortalities and exposure in polluted environments typical of the developing world.

Studies that only consider emissions and resulting ambient PM2.5 concentrations from one source and apply the exposure-response function at low ambient levels (for example Fann et al. 2013; Howard et al. 2019; Levy et al. 2009; Strasert et al. 2019; Thind et al. 2019) usually overestimate the health effects of that source because of the sub-linear relationship at higher exposure levels. Other studies calculate the actual health benefit that would be realised by an emission reduction intervention by applying the exposure-response function at ambient PM2.5 levels (Cropper et al. 2019; Kodros et al. 2016; Kushta et al. 2021; Marais et al. 2019). However, this provides an incentive for those paying for the emission reduction to delay their expenditure, because greater incremental reductions in premature mortalities will be realised once ambient PM2.5 concentrations have decreased due to reductions from other emission sources. We prefer the approach of Cropper et al. (2021), who argue that each source should be treated as the “average emitter”. This is achieved by calculating the total number of premature mortalities due to exposure to ambient PM2.5 levels, and then attributing deaths to coal-fired power station emissions based on the fraction of PM2.5 that they contribute to total PM2.5 concentrations.

This study aims to improve on previous assessments of premature mortality attributed to exposure to PM2.5 derived from coal-fired power stations in South Africa by considering coal-fired power stations to be average emitters in the context of all sources. South Africa is a developing country and relatively high ambient PM2.5 concentrations occur in some places, especially low-income residential areas where solid fuels are used (Hersey et al. 2015). The significance of the method of application of the exposure-response function in health impact calculations is assessed in the South African environment by comparing the number of attributable premature mortalities to the number of avoidable premature mortalities and the number of premature deaths calculated assuming a linear relationship between PM2.5 exposure and premature mortalities. In order to estimate average PM2.5 concentrations as accurately as possible in highly populated low-income areas, PM2.5 concentrations are interpolated across the domain in north-eastern South Africa using measurements from 38 ambient monitoring stations and a regression model based on household socio-economic factors.

Study area, data and methodology

In this study, the number of premature deaths due to exposure to secondary PM2.5 formed from SO2 and NOx emissions from coal-fired power stations is calculated for a region centred over the industrialised Highveld in the north-east of South Africa for 2019. Total ambient PM2.5 concentrations are obtained from measurements at 38 ambient air quality monitoring stations (Figure S1) and spatially interpolated at ward level using a regression model based on household socio-economic factors. PM2.5 concentrations derived from coal-fired power station emissions are calculated using the reduced-form intake fraction model of Zhou et al. (2006). Premature mortality due to exposure to coal-fired power station PM2.5 is then calculated by three methods: considering power stations to be average emitters (attributable deaths); applying the exposure-response function as a linear function, as is done when other emissions are not considered (termed “single-source deaths” in this study); and applying the exposure-response function at ambient PM2.5 concentrations, which represents the health benefit from reducing the emissions of the power stations while other sources are unchanged (avoidable deaths). All calculations are performed at the ward level. Wards are geopolitical subdivisions of municipalities used for electoral purposes and collection of statistical data. The average area of a ward in the study area is 140 km2.

Study area

The study domain is a roughly circular area 480 km in diameter in the north-eastern region of South Africa (Fig. 1). The study area encompasses 13 coal-fired power stations, mining activities, industries and smelters in the vicinity of the mines, and Gauteng, South Africa’s most urbanised and populous province. Approximately 23 million of South Africa’s 59 million people live in the study area. The altitude in the study area ranges from around 700 m in the north to over 2300 m, with most of the study area being on the interior plateau called the Highveld. Regions with lower altitude, such as the Lowveld in the extreme eastern and northern parts of the country, were excluded as particulate matter originating on the Highveld is unlikely to be advected to ground level at much lower altitudes due to the presence of a persistent absolutely stable layer at the 850-hPa level, which restricts vertical transport of pollutants (Freiman and Piketh 2003).

Fig. 1

The study area, shaded in grey. Bold lines show provincial borders and thin lines show ward boundaries. The study area covers parts of the Limpopo (LM), Mpumalanga (MP), Gauteng (GT), Free State (FS) and North-West (NW) provinces. Locations of coal-fired power plants are from Eskom (2015)

The study area is situated largely in the subtropics, between 22.5°S and 27.75°S. Rainfall occurs during summer, mainly in association with tropical easterly flow. Anticyclonic conditions dominate in winter, bringing calm, clear conditions. Surface radiation inversions are common at night and there are persistent elevated absolutely stable layers found at ~ 700 hPa and ~ 500 hPa (altitudes of around 3 km and 5 km). The calm conditions and atmospheric stability prevent effective dispersion of pollutants (Tyson and Preston-Whyte 2000). Moreover, pollution concentrations are elevated by atmospheric recirculation that is estimated to occur around 33% of the time (Freiman and Piketh 2003).

Calculation of total ambient PM2.5 concentrations

Total annual ambient PM2.5 concentrations in the study area were estimated from measurements at 38 ambient monitoring stations (Figure S1) and interpolated across all wards using a regression model based on household socio-economic factors. This approach was adopted because secondary PM2.5 concentrations have only been poorly modelled in South Africa to date, probably because emission inventories are not well quantified, and also because it was deemed important to adequately represent ambient PM2.5 concentrations in populous low-income residential communities where many people are exposed to high PM concentrations. The regression model was developed for 2011 since the last comprehensive census in South Africa collecting socio-economic data was conducted in 2011. The ambient monitoring stations are located across the study area but concentrated in the cities in Gauteng and the industrial regions on the Mpumalanga Highveld, northern Free State and the Waterberg area in Limpopo. In order to ensure accurate PM2.5 averages based on sufficient data, concentrations were sourced for 2010 to 2013. Raw data was obtained from the South African Air Quality Information System (SAAQIS, https://saaqis.environment.gov.za/) and quality-assured averages were obtained from Khumalo (2016), Feig et al. (2019) and Govender and Sivakumar (2019) (Table S1).

In South Africa, highest ambient PM concentrations are found in low-income residential areas (Hersey et al. 2015). Internationally, too, high ambient PM2.5 concentrations are associated with informal settlements, high population density and the residential use of solid and liquid fuels (Egondi et al. 2016; Lin et al. 2013; Lou et al. 2016; Saucy et al. 2018). The variables that were tested for an association with PM2.5 concentration at ward level are household income level, population density, type of dwelling, family size per household, fuel type used for heating, fuel type used for cooking and gender of household head. The strength of the relationship between ambient PM2.5 concentrations and the selected socio-economic variables was tested using the Pearson correlation coefficient. The factor in each category with the highest Pearson correlation coefficient with annual average PM2.5 concentrations was selected for use in the regression model. The dependence of ambient PM2.5 concentrations on these selected socio-economic factors was then quantified using a multivariate linear regression, using IBM SPSS Statistics for Windows, Version 26.0.
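The two-step procedure described above (selecting one factor per category by Pearson correlation, then fitting a multivariate linear regression) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis performed in the study: the data are synthetic, NumPy's least-squares solver stands in for IBM SPSS, the variable names are hypothetical, and selecting by the absolute value of the correlation coefficient is our assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stations = 38  # number of monitoring stations in the study

# Synthetic stand-in data (NOT the study's measurements or census values).
heating_fuel_shares = {
    "paraffin": rng.uniform(0.0, 0.4, n_stations),
    "coal": rng.uniform(0.0, 0.3, n_stations),
    "wood": rng.uniform(0.0, 0.3, n_stations),
}
population_density = rng.uniform(10.0, 20000.0, n_stations)
pm25 = (10.0 + 0.002 * population_density
        + 60.0 * heating_fuel_shares["paraffin"]
        + rng.normal(0.0, 3.0, n_stations))


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# Step 1: within one category (heating fuel), keep the factor that is most
# strongly correlated with PM2.5 (absolute value is an assumption here).
fuel_r = {name: pearson_r(share, pm25)
          for name, share in heating_fuel_shares.items()}
best_fuel = max(fuel_r, key=lambda name: abs(fuel_r[name]))

# Step 2: multivariate linear regression (ordinary least squares) with an
# intercept column plus the selected factors.
X = np.column_stack([np.ones(n_stations),
                     population_density,
                     heating_fuel_shares[best_fuel]])
coefficients, *_ = np.linalg.lstsq(X, pm25, rcond=None)

# Step 3: coefficient of determination (R^2) of the fitted model.
residuals = pm25 - X @ coefficients
r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((pm25 - pm25.mean()) ** 2)
```

The first element of `coefficients` plays the role of the constant (background) term, and `r_squared` is the share of variance in the station averages explained by the selected factors.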

Other sources such as vehicles and industrial emissions which contribute to ambient PM2.5 concentrations are not taken into account when using household socio-economic variables as predictors of PM2.5 concentrations. Secondary PM from industrial activities is more uniformly distributed because the precursors are often emitted from tall stacks and so are taken into account to some extent in the background concentrations (constant term in the regression equation). It is likely that ambient PM2.5 concentrations are underpredicted using this method in areas that are heavily influenced by vehicle or industrial emissions.

Calculation of PM2.5 concentrations from coal-fired power generation

Ambient PM2.5 concentrations derived from coal-fired power station emissions are estimated using the intake fraction-based regression model of Zhou et al. (2006), after Myllyvirta (2014). The regression model calculates the fraction of primary emissions and secondary PM that are inhaled, based on population residing at various distances from the source and annual average precipitation. The regression model was derived from modelling the transport and chemical transformation of power plant emissions in China using the CALPUFF atmospheric dispersion model. The model predicts that 50% of the total intake fraction occurs within around 200 km of the source for primary PM10, and within 1018 km and 1172 km of the source for secondary sulphate and nitrate particles, respectively (Zhou et al. 2006). Emissions of SO2, NOx and PM10 in excess of the 2020 emission standards are considered for Eskom’s 12 large coal-fired power stations (no emissions from Kusile are included since Kusile was built to operate in full compliance with the new plant emission standards; Table 1). PM2.5 concentrations were assigned per ward.

Table 1 Actual emissions and emissions in excess of the 2020 Minimum Emission Standards for Eskom’s 12 large coal-fired power stations

Calculation of premature mortalities from power station emissions

Premature deaths arising from the inhalation of PM2.5 formed from power station emissions are calculated in three ways in order to test the sensitivity of the results to the application of the exposure-response function (Fig. 2):

i) Attributable deaths are calculated by treating each source as an average emitter (Cropper et al. 2021). Total premature deaths from exposure to PM2.5 derived from all sources are calculated using a log-linear exposure-response function (Burnett et al. 2014) and multiplied by the fraction of ambient PM2.5 attributable to coal-fired power stations. We consider this to be the most appropriate way of calculating the health impact of a single source.

ii) Avoidable deaths are those that would be prevented if, in this case, emissions from power stations were reduced to comply with the new plant emission limits. Avoidable deaths are calculated using a log-linear integrated exposure-response function that flattens out at higher PM2.5 concentrations (Burnett et al. 2014), and each source is treated as contributing the last micrograms inhaled (the marginal impact). Applying this method creates an incentive for delayed implementation of emission reduction measures and underestimates the health cost if all sources are considered in such a manner.

iii) Single-source deaths were calculated by earlier studies that only considered the single source of concern and assumed a linear relationship between PM2.5 exposure and the number of premature mortalities (e.g. Andersson et al. 2009; Brandt et al. 2013; Caiazzo et al. 2013; Cropper et al. 2012). These exposure-response functions were derived in areas with relatively low ambient PM2.5 concentrations (Krewski et al. 2009; Pope et al. 2002). This method avoids the need to consider emissions from other sources. The linear method effectively treats all sources as the first microgram(s) inhaled. However, because there is in fact a sub-linear relationship between PM2.5 concentrations and the number of premature deaths at higher concentrations, this results in an overestimation of actual deaths if all sources are treated in this way.

Fig. 2

Illustration of the differences in calculating the number of premature deaths due to a reduction in exposure to PM2.5 of 5 μg/m3 in three different ways: avoidable deaths reflect the marginal impact assuming all other emissions stay the same; attributable deaths are calculated by treating each source as an average emitter; and single-source deaths are calculated by only considering the single source of concern. The global relationship between total attributable mortality and PM2.5 concentration was derived by Apte et al. (2015) from Burnett et al.’s (2014) Integrated Exposure Response functions for five endpoints: stroke, ischaemic heart disease, chronic obstructive pulmonary disease and lung cancer for adults, and acute lower respiratory infections in children
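The difference between the three methods can also be illustrated numerically. The sketch below is a simplified illustration only: it uses a hypothetical concave curve (a logarithm with arbitrary `scale` and `shape` parameters and no counterfactual threshold) in place of the Integrated Exposure Response functions, and all function names and parameter values are illustrative assumptions rather than part of the study.

```python
import math


def hypothetical_deaths(concentration, scale=1000.0, shape=0.1):
    """Illustrative concave (sub-linear) mortality curve; NOT the IER."""
    return scale * math.log1p(shape * concentration)


def attributable_deaths(total, source):
    """Average-emitter method: total deaths times the source's PM2.5 share."""
    return hypothetical_deaths(total) * source / total


def avoidable_deaths(total, source):
    """Marginal method: deaths at ambient level minus deaths without the source."""
    return hypothetical_deaths(total) - hypothetical_deaths(total - source)


def single_source_deaths(source, scale=1000.0, shape=0.1):
    """Linear method: the curve's initial slope applied to the source alone."""
    initial_slope = scale * shape  # derivative of the curve at zero concentration
    return initial_slope * source


# Hypothetical ward: 40 ug/m3 total ambient PM2.5, of which 5 ug/m3 from the source.
total_pm25, source_pm25 = 40.0, 5.0
avoidable = avoidable_deaths(total_pm25, source_pm25)
attributable = attributable_deaths(total_pm25, source_pm25)
single_source = single_source_deaths(source_pm25)
```

For any concave curve that passes through the origin, the avoidable estimate is the smallest and the single-source estimate the largest, with the attributable estimate in between.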

Exposure-response functions are applied to estimate premature mortalities attributable to PM2.5 exposure on a ward basis. The following illnesses are considered: ischaemic heart disease (IHD), stroke, chronic obstructive pulmonary disease (COPD), lung cancer and lower respiratory infection (for children under 5 years). The log-linear relative risk model used for the calculation of attributable deaths was adopted from Burnett et al.’s (2014) Integrated Exposure Response (IER) function. The IER function more accurately estimates premature mortalities at high PM2.5 concentrations and so is suitable for use in areas like South Africa’s solid fuel-using residential areas where annual average PM2.5 concentrations exceed 40 μg/m3. The IER relative risk information from Burnett et al. (2014) for people aged 25 years was selected since the median age in South Africa is 27 years according to mid-2019 population estimates (Statistics South Africa 2019). The counterfactual concentration is the PM2.5 concentration below which there is negligible increased risk of mortality from PM2.5 exposure based on evidence from epidemiologic studies and is between 5.8 and 8.8 μg/m3 for the IER function.

Premature deaths attributable to coal-fired power plant emissions are determined by calculating total premature deaths from exposure to ambient PM2.5 concentrations, and then ascribing a proportion of these deaths to coal-fired power stations based on the fraction of PM2.5 that coal-fired power stations contribute to the total ambient PM2.5 concentration:

$${Mortality}_{CFPS}={Mortality}_{ambient}\times \frac{{PM_{2.5}}_{CFPS}}{{PM_{2.5}}_{ambient}}$$
(1)

where

\({Mortality}_{CFPS}\): premature deaths attributable to exposure to PM2.5 derived from coal-fired power stations

\({Mortality}_{ambient}\): premature deaths attributable to ambient PM2.5 exposure

\({PM_{2.5}}_{CFPS}\): annual average PM2.5 concentration (μg/m3) from coal-fired power stations

\({PM_{2.5}}_{ambient}\): annual average ambient PM2.5 concentration (μg/m3) (from regression model)

Premature mortality due to exposure to ambient PM2.5 concentrations was calculated by multiplying the exposure-response function with the population size:

$${Mortality}_{ambient}=\kern0.5em ERF\times \kern0.5em population\ size$$
(2)

where the exposure response function (ERF) is calculated as:

$$ERF= AF\times incidence\ of\ disease$$
(3)

and the proportion of incidence of disease in the population that is attributable to the risk factor, the attributable fraction (AF), is calculated as:

$$AF=\frac{{RR}_{\mathit{\log}- lin}-1}{{RR}_{\mathit{\log}- lin}}$$
(4)

where RRlog − lin is the log-linear relative risk derived by Burnett et al. (2014). Relative risk is the probability of being diagnosed with a certain illness when exposed to certain concentrations of a pollutant for a certain period compared to the probability of being diagnosed with the illness in the absence of exposure.
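The chain of calculations in Eqs. (1) to (4) can be sketched in a few lines. This is a minimal illustration that assumes the standard attributable fraction, (RR − 1)/RR; the function names and the example inputs (relative risk, baseline rate, ward population and concentrations) are hypothetical and are not values from this study.

```python
def attributable_fraction(relative_risk):
    """Eq. (4): share of disease incidence attributable to the exposure."""
    return (relative_risk - 1.0) / relative_risk


def ambient_mortality(relative_risk, incidence_rate, population):
    """Eqs. (2) and (3): deaths attributable to total ambient PM2.5."""
    erf = attributable_fraction(relative_risk) * incidence_rate  # Eq. (3)
    return erf * population  # Eq. (2)


def cfps_mortality(mortality_ambient, pm25_cfps, pm25_ambient):
    """Eq. (1): deaths apportioned to coal-fired power stations (CFPS)."""
    return mortality_ambient * pm25_cfps / pm25_ambient


# Hypothetical ward (illustrative numbers only).
example_rr = 1.5            # relative risk at the ward's ambient PM2.5 level
example_incidence = 0.001   # deaths per person per year for one disease
example_population = 100_000
deaths_ambient = ambient_mortality(example_rr, example_incidence, example_population)
deaths_cfps = cfps_mortality(deaths_ambient, pm25_cfps=1.5, pm25_ambient=30.0)
```

In a ward-level calculation, these steps would be repeated for each disease endpoint and each ward and then summed.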

The linear-exponential risk model used for the calculation of single-source deaths was adopted from the American Cancer Society study (Krewski et al. 2009), and the number of premature mortalities is calculated using Eqs. (2), (3), (4). The number of avoidable deaths was calculated as the difference between the number of premature mortalities at ambient PM2.5 concentrations, and the number of premature mortalities at ambient PM2.5 concentrations less the contribution from coal-fired power stations.

The population incidence of disease in South Africa was acquired from the Global Burden of Disease 2017 study (IHME 2019) for ischaemic heart disease, lung cancer, chronic obstructive pulmonary disease, stroke and lower respiratory infection (for children under 5 years) (Table 2). Although the population incidence of disease from the South African National Burden of Disease 2013 study (Pillay-van Wyk et al. 2016) is similar, it was not used because the GBD 2017 values are more recent and the South African values do not have a population incidence of disease for lung cancer.

Table 2 Population incidence of selected diseases in South Africa in 2017 (IHME 2019)

Population counts per ward for 2019 were calculated by multiplying the 2011 ward population from the last census with the applicable district municipal population growth factor between 2011 and 2019 obtained from the mid-year population estimates for district municipalities (Statistics South Africa 2019).
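The population scaling step amounts to one multiplication per ward. The sketch below assumes that the growth factor is the ratio of the 2019 to the 2011 district population; the ward identifiers, district names and numbers are hypothetical.

```python
def project_ward_population(population_2011, district_growth_factor):
    """Scale a 2011 census ward count to 2019 using its district's growth factor."""
    return population_2011 * district_growth_factor


# Hypothetical inputs: ward -> (2011 population, district name).
wards_2011 = {"ward_A": (10_000, "district_X"), "ward_B": (25_000, "district_Y")}
# Hypothetical district growth factors (2019 population / 2011 population).
growth_factors = {"district_X": 1.12, "district_Y": 1.05}

wards_2019 = {
    ward: project_ward_population(population, growth_factors[district])
    for ward, (population, district) in wards_2011.items()
}
```

All wards in a district share the same growth factor in this approach, so differences in growth between wards within a district are not captured.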

Limitations of methodology

The two major sources of uncertainty in calculating the health impacts of PM2.5 exposure are the estimates of ambient PM2.5 concentrations and the choice and method of application of the exposure-response function. The predictions of power station-derived PM2.5 concentrations in this study are rough approximations at best. Much better predictions would have been obtained using a chemical transport model (e.g. van der Walt (2023), unpublished) and it would have been even better to use satellite retrievals in conjunction with the chemical transport model. The estimates of total ambient PM2.5 are accurate to at most 67% at the monitoring stations, and are expected to be less accurate as distance from the monitoring stations increases. Emissions of PM precursors from industries and vehicles are not accounted for in the regression model, and the effect of local air circulations and micro-meteorology on pollutant dispersion is not taken into account. The exposure-response functions used have not been validated on the South African population. The high prevalence of HIV in the South African population (13.7% in 2021) (Statistics South Africa 2022), for example, may alter the susceptibility of South Africans to specific diseases. PM2.5 is also linked to increases in the incidence of other diseases, e.g. type II diabetes, but these health outcomes are not included due to a lack of information regarding the health outcome’s exposure-response function for the relative risk models.

Results

Total ambient PM2.5 concentrations

Annual average PM2.5 concentrations measured at the 38 monitoring stations in north-eastern South Africa in 2011 range between 14 and 120 μg/m3 and average 33 μg/m3 (Table S1). The two stations with the lowest annual averages are located in the Waterberg District Municipality in Limpopo (near Matimba and Medupi Power Stations). Six of the seven highest annual PM2.5 averages are recorded in low-income residential areas.

The following multivariate regression model was derived relating annual average PM2.5 concentration to household socio-economic factors per ward in the study area for 2011:

$${PM}_{2.5}\ \left(\mu g/{m}^3\right)=7.80+\left(0.0021\ast {x}_1\right)+\left(34.22\ast {x}_2\right)+\left(78.29\ast {x}_3\right)+\left(11.94\ast {x}_4\right)$$
(5)

where

\({x}_1\): population density (number of people/km2)

\({x}_2\): households with no income (% of total households)

\({x}_3\): households using paraffin for heating (% of total households)

\({x}_4\): formal dwellings (% of total households)
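Applying Eq. (5) to a single ward can be sketched as below. The coefficients are those of Eq. (5); the function name and the example ward are hypothetical. The household shares are entered here as fractions between 0 and 1, which is an assumption on our part: the variable definitions give them as percentages, but the size of the coefficients suggests fractions.

```python
def predict_pm25(population_density, share_no_income,
                 share_paraffin_heating, share_formal_dwellings):
    """Annual average PM2.5 (ug/m3) for a ward from Eq. (5).

    population_density is in people per km2; the three household shares
    are ASSUMED to be fractions of total households (0 to 1).
    """
    return (7.80
            + 0.0021 * population_density
            + 34.22 * share_no_income
            + 78.29 * share_paraffin_heating
            + 11.94 * share_formal_dwellings)


# Hypothetical ward: 1000 people/km2, 15% of households with no income,
# 5% heating with paraffin and 80% living in formal dwellings.
example_pm25 = predict_pm25(1000.0, 0.15, 0.05, 0.80)
```

With all four predictors set to zero, the function returns the constant term of 7.80 μg/m3, which represents the background concentration in the regression.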

In the development of the multivariate regression model, the socio-economic factors having the highest Pearson correlation coefficient with annual average ambient PM2.5 concentrations (μg/m3) are the population density (0.79), percentage of households heating with paraffin (0.68), households with no income (0.65) and percentage of formal dwellings (−0.43) (Table 3). The percentage of informal dwellings had a higher correlation with ambient PM2.5 concentration (0.57) than the percentage of formal dwellings but was not included because it is not independent of the percentage of households heating with paraffin: informal dwellings and households heating with paraffin have a Pearson correlation coefficient of 0.92.

Table 3 Socio-economic factors in each category, as measured during Census 2011, with the highest Pearson correlation coefficient with annual average PM2.5 concentrations at the ward level in north-eastern South Africa

The regression model (Eq. (5)) has an R2 value of 0.672, which means 67% of the variance in observed PM2.5 concentrations at the ambient air quality monitoring stations is explained by the socio-economic factors considered. Other sources that have not been included, such as vehicle and industrial emissions, are taken into account to some extent in the baseline level (constant term in Eq. (5)).

Annual average PM2.5 concentrations estimated with the regression model per ward range between 14 and 154 μg/m3 in the study area (Fig. 3). Some of the highest PM2.5 concentrations are found in Gauteng, the most densely populated area, and the three priority areas. Priority areas, as declared by the Minister of Environmental Affairs, are regions where there is believed to be non-compliance with ambient air quality standards, and additional emission control measures are required (Fig. 3). High PM2.5 concentrations estimated in the southern parts of the Waterberg-Bojanala Priority Area are due to a higher number of households with no income and need to be validated against measurements.

Fig. 3

Annual average ambient PM2.5 concentrations (μg/m3) per ward estimated by the socio-economic regression model over the study area. Green, red and blue boundaries indicate the Highveld Priority Area (HPA), Waterberg-Bojanala Priority Area (WBPA) and Vaal Triangle Airshed Priority Area (VTAPA), respectively. Black lines are provincial boundaries

PM2.5 concentrations from power station emissions

Annual average PM2.5 concentrations derived from power station emissions are crudely estimated using Zhou et al.’s (2006) reduced-form intake fraction model to be at most 1.81 μg/m3 over the study area (Fig. 4). The secondary PM2.5 is predominantly sulphate compounds formed from SO2, which accounts for 83% of emissions (by tons) in excess of the Minimum Emission Standards. Highest concentrations are found over the Mpumalanga Highveld where the majority of the large coal-fired power stations are located. Higher concentrations of between 1.23 and 1.34 μg/m3 are also found over the western parts of Limpopo province, where the large Matimba and Medupi Power Stations are located. Waterberg coal typically has higher sulphur content than Mpumalanga coal, accounting for the high SO2 emissions from these power stations.

Fig. 4

Annual average PM2.5 concentrations due to emissions of SO2, NOx and PM in excess of the new plant emission limits from large coal-fired power stations, calculated using the Zhou et al. (2006) regression model after Myllyvirta (2014)

The proportion of total annual average PM2.5 concentrations contributed by coal-fired power stations ranges between 0.8 and 8.1%. Power stations make the greatest contribution in the Waterberg, where ambient PM2.5 concentrations derived from other sources are low, and the lowest contribution in the City of Johannesburg where there are large local sources of PM2.5.

Premature mortalities from total and power station-derived PM2.5 exposure

There are an estimated 23,628 premature deaths in the north-eastern parts of South Africa each year due to exposure to total ambient PM2.5 (Table 4). Of these, just over half are due to IHD. Strokes are responsible for the second highest number of premature deaths. Around 106 children under the age of 5 are estimated to die due to lower respiratory infection (LRI) per year.

Table 4 Estimated premature deaths per year in north-eastern South Africa due to exposure to total ambient PM2.5 concentrations and premature deaths attributable to exposure to PM2.5 derived from coal-fired power stations. Premature deaths per year due to coal-fired power station emissions calculated assuming a linear relationship between premature deaths and PM2.5 concentration are shown for comparison

There are 802 attributable premature deaths per year due to exposure to PM2.5 derived from coal-fired power stations, using the “average emitter” approach. If a linear relationship between premature deaths and exposure to PM2.5 is assumed for single-source deaths, 44% more premature deaths (1152) are calculated. The highest percentage difference in premature deaths calculated using the two methods is for lung cancer (128%) followed by IHD (74%), and the lowest percentage difference is for lower respiratory infection in children younger than 5 years (25%) and stroke (14%). Assuming power stations reduce their emissions while all other sources stay the same, there will be 428 fewer premature mortalities a year (avoidable deaths).

As expected, the percentage difference in the number of premature mortalities calculated using the attributable and linear approaches increases at higher ambient PM2.5 concentrations (Fig. 5). In the study area, the relative difference between the two methods is greater than 50% at an annual average concentration of around 40 µg/m3. The use of the single-source (linear) method results in the greatest overestimates in areas such as the North-West Province and even parts of Gauteng, where secondary PM2.5 concentrations derived from coal-fired power station emissions are lower (Fig. 6). The numbers of premature deaths calculated with the two methods are more similar in areas closer to the power stations, such as the Mpumalanga Highveld and the Waterberg.

Fig. 5

A comparison per ward of the number of attributable and single-source (linear) premature deaths due to power station PM2.5 exposure. The percentage difference between the number of deaths calculated using the two methods is represented by red dots and is calculated as \(\left[\frac{V1-V2}{\frac{V1+V2}{2}}\right]\ast 100\), where V1 and V2 are the numbers of deaths calculated using the single-source and attributable approaches, respectively

Fig. 6

Percentage difference in attributable and single-source (linear) premature deaths linked to secondary PM2.5 exposure emanating from coal-fired power stations. The insert indicates ambient PM2.5 concentrations estimated by the socio-economic regression model. The percentage difference between the two approaches to applying the exposure-response function is calculated as \(\left[\frac{V1-V2}{\frac{V1+V2}{2}}\right]\ast 100\), where V1 and V2 are the numbers of deaths calculated using the single-source and attributable approaches, respectively; positive percentage values indicate areas where the single-source approach estimates higher premature deaths than the attributable approach and negative values indicate the opposite
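The percentage difference given in the captions of Figs. 5 and 6 is a symmetric measure, normalised by the mean of the two values, and so differs from a simple relative increase. The minimal sketch below contrasts the two, using the study-wide totals of 1152 single-source and 802 attributable deaths purely as an illustration; the per-ward values plotted in the figures are not reproduced here. The function names are illustrative, and taking V1 as the single-source value is an assumption based on the sign convention stated in the Fig. 6 caption.

```python
def symmetric_percentage_difference(v1: float, v2: float) -> float:
    """Difference between v1 and v2 as a percentage of their mean."""
    return (v1 - v2) / ((v1 + v2) / 2.0) * 100.0


def relative_increase(v1: float, v2: float) -> float:
    """Increase of v1 over v2 as a percentage of v2."""
    return (v1 - v2) / v2 * 100.0


single_source_deaths = 1152  # V1, assumed to be the single-source (linear) value
attributable_deaths = 802    # V2, the attributable ("average emitter") value

print(f"symmetric: {symmetric_percentage_difference(single_source_deaths, attributable_deaths):.1f}%")
print(f"relative:  {relative_increase(single_source_deaths, attributable_deaths):.1f}%")
```

For these totals the symmetric difference is about 36%, whereas the increase relative to the attributable number is about 44%, the figure quoted in the text. The symmetric form is bounded at ±200% for non-negative values, which keeps wards with very small attributable numbers on a readable scale.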

Discussion

Comparison of total ambient PM2.5 concentrations from regression model with other studies

The annual ambient PM2.5 concentrations used for the premature mortality calculations in this study are slightly higher than the PM2.5 concentrations found in other studies (Table 5). The higher total ambient PM2.5 concentration used in the calculation of premature mortalities lowers the number of deaths attributable to power station emissions because power station-derived PM2.5 contributes a smaller proportion of total PM2.5. Nevertheless, we suggest that a total ambient PM2.5 concentration on the high side may be a more accurate reflection of actual PM2.5 exposure levels, especially in low-income communities using solid and liquid fuels, where indoor particulate matter concentrations and personal exposure levels are measured to be consistently higher than ambient concentrations (Language et al. 2016; Wernecke et al. 2015).

Table 5 Comparison of total annual average ambient PM2.5 concentration over South Africa in previous studies and the methods used for the calculation

Accuracy of estimate of PM2.5 derived from coal-fired power station emissions

The estimates of PM2.5 concentrations derived from power station emissions appear to be of the correct order of magnitude, albeit somewhat low, based on comparison with results of studies on the ionic composition of aerosol samples collected in the study area (Table 6). It is to be expected that the PM2.5 concentrations in this study are slightly low, given that only emissions in excess of the minimum emission standards were considered. Moreover, SO2 and NOx emissions can also increase the amount of ammonium PM since they typically interact with ammonia to form ammonium sulphate, ammonium bisulphate, and ammonium nitrate PM. Source apportionment studies have focussed on low-income residential settlements, where domestic coal burning and vehicle emissions may also contribute to measured concentrations of sulphates and nitrates. Unfortunately, the PM composition studies have been performed at so few locations and for such short periods that it is not possible to validate the coal-fired power station PM2.5 concentrations with them.

Table 6 Comparison of source apportionment results from previous studies with estimates of PM2.5 concentrations from coal-fired power stations (as in Fig. 4)

PM2.5 concentrations derived from coal-fired power station emissions are a large source of uncertainty in this study, and better estimates are almost certainly provided in an unpublished study by van der Walt (2023). Van der Walt (2023) used a chemical transport model, the Comprehensive Air Quality Model with Extensions (CAMx), to determine the annual average PM2.5 concentration derived from Eskom coal-fired power plants. PM2.5 concentrations from all sources and from all sources excluding the power stations were simulated, and the power station-derived PM2.5 concentrations were calculated by difference. The modelling domain was the industrial Highveld only, much smaller than the domain in this study, and the model was run at fine resolution (around 2 km by 2 km). Near the power stations, PM2.5 concentrations were higher than those calculated in this study (ranging between just under 1 µg/m3 and just over 3 µg/m3 over the industrial Highveld, where this study calculates PM2.5 concentrations of around 1.8 µg/m3).
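The "by difference" step described above amounts to subtracting two gridded concentration fields cell by cell. The sketch below shows that step only; the function name and the small 2 × 2 grids are hypothetical placeholders and are not output from CAMx or from van der Walt (2023).

```python
def source_contribution(all_sources, without_source):
    """Cell-by-cell difference between two gridded annual mean PM2.5 fields (ug/m3).

    all_sources    -- simulation with every emission source included
    without_source -- same simulation with the source of interest removed
    """
    if len(all_sources) != len(without_source):
        raise ValueError("grids must have the same number of rows")
    contribution = []
    for row_all, row_without in zip(all_sources, without_source):
        if len(row_all) != len(row_without):
            raise ValueError("grid rows must have the same length")
        contribution.append([a - b for a, b in zip(row_all, row_without)])
    return contribution


# Hypothetical 2 x 2 grids, for illustration only.
pm25_all = [[22.0, 25.5], [30.0, 18.25]]
pm25_no_power_stations = [[21.0, 23.0], [27.5, 17.5]]

print(source_contribution(pm25_all, pm25_no_power_stations))
```

Because secondary PM2.5 formation is non-linear, a contribution obtained by removing one source need not equal the share that would be assigned to it by other apportionment techniques, so the result is best read as the change expected if that source were switched off.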

Comparison of coal-fired power plant premature mortalities with other South African studies

The number of premature deaths due to exposure to power station-derived PM2.5 concentrations calculated in different studies in South Africa varies by over an order of magnitude (Table 7). McDuffie et al. (2021) calculate by far the highest incidence of premature mortalities (0.043–0.060% of the exposed population per annum), followed by Marais et al. (2019) (0.016% of the exposed population). The number of premature deaths calculated by McDuffie et al. (2021) is higher because they consider total power station emissions, not just those added after a certain year (as Marais et al. (2019) did) or those in excess of the new plant emission limits (as the other studies in Table 7 did). It appears that the power station-derived PM2.5 concentrations are estimated to be higher by chemical transport models (used by McDuffie et al. (2021) and Marais et al. (2019)) than by atmospheric dispersion models. Chemical transport model calculations are expected to be more accurate because the models typically cover larger areas, so recirculating flow is not “lost” from the model domain, and chemical reactions with emissions from other sources are also incorporated. The incidence of premature mortality calculated in this study is on a par with the incidence calculated in the other dispersion modelling studies. A review of other, mainly non-peer-reviewed, studies of the health impacts of South Africa’s power station emissions is provided by Langerman and Pauw (2018).
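Expressing premature deaths as a percentage of the exposed population is what allows studies with different domains to be compared. A minimal sketch of the conversion follows; the function names and the exposed population of one million are hypothetical and serve only to show what the quoted incidences mean in absolute terms.

```python
def incidence_percent(deaths_per_year: float, exposed_population: float) -> float:
    """Premature deaths per year as a percentage of the exposed population."""
    if exposed_population <= 0:
        raise ValueError("exposed_population must be positive")
    return deaths_per_year / exposed_population * 100.0


def deaths_from_incidence(incidence_pct: float, exposed_population: float) -> float:
    """Inverse conversion: deaths per year implied by an incidence in percent."""
    return incidence_pct / 100.0 * exposed_population


# Hypothetical exposed population, for illustration only.
population = 1_000_000

for label, pct in [("0.016%", 0.016), ("0.043%", 0.043), ("0.060%", 0.060)]:
    deaths = deaths_from_incidence(pct, population)
    print(f"{label} of {population:,} people is about {deaths:.0f} deaths per year")
```

For a population of one million, an incidence of 0.016% corresponds to about 160 deaths per year, and the range 0.043–0.060% to about 430–600 deaths per year.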

Table 7 Comparison of the incidence of premature mortalities from power station-derived PM2.5 in the exposed population in South Africa for different studies

Conclusion

This study into the health risks of exposure to PM2.5 derived from emissions from coal-fired power stations in north-eastern South Africa has demonstrated that the method of applying the exposure-response function can introduce large uncertainties in the number of premature mortalities calculated. We argue that the average emitter approach, which assigns health impact based on the proportion of PM2.5 contributed to total ambient PM2.5 concentrations by the source under consideration, is the most equitable way to attribute health impacts. Premature mortalities in north-eastern South Africa are calculated to be 44% higher when neglecting other sources of PM2.5 precursors and 53% lower when calculating avoidable deaths based on marginal reductions in PM2.5.

The best estimate of the number of annual premature deaths arising due to exposure to power station-derived PM2.5 in north-eastern South Africa calculated in this study is 802. The proportion of total annual average PM2.5 concentrations contributed by coal-fired power stations ranges between 0.8 and 8.1%. These figures are probably too low because PM2.5 concentrations derived from power station emissions calculated using the reduced-form intake fraction model are too low near the power stations (van der Walt 2023) and the use of ambient air quality monitoring stations sited mainly in highly populated areas biases the ambient PM2.5 concentrations to higher values. Also, premature mortalities will increase as the domain of the study is increased.

Nevertheless, this study has shown that the method of applying the exposure-response function can cause large uncertainties in premature mortality numbers. The contribution of coal-fired power stations in South Africa to PM2.5 concentrations calculated in global chemical transport studies (e.g. Marais et al. 2019; McDuffie et al. 2021) is probably too high. The best estimate of the health impact of emissions from South Africa’s coal-fired power stations will be obtained using chemical transport models to calculate ambient and power station-derived PM2.5 concentrations, considering all of southern Africa in the domain, and applying the exposure-response function using the “average emitter” approach.