Background

Lung cancer is the leading cause of cancer deaths globally [1]. The Global Burden of Disease Study estimated that 1.7 million people died from lung cancer in 2015 [1]. Although tobacco smoking is one of the major risk factors for the disease [2, 3], a considerable fraction of lung cancer mortality remains unexplained [4]. This is particularly noticeable in many high-income countries, where smoking prevalence has shown an apparent downward trend [5]. Therefore, research in the past two decades has focused on environmental determinants of lung cancer [4, 6].

The petrochemical manufacturing industry, defined as petroleum refining (Standard Industrial Classification code [SIC] 2911) or industrial organic chemicals manufacturing (SIC 2869), involves processes that produce and potentially emit hazardous chemicals into the surrounding air, soil, and water. Petrochemical manufacturing factories are usually clustered in an industrial area together with other manufacturing processes or industries, such as steel, coking, and thermoelectric plants [7, 8]; such clusters are called petrochemical industrial complexes (PICs). Several studies have detected environmental air pollutants near petrochemical manufacturing plants [9,10,11,12] and also after occasional fire accidents at petrochemical plants [13]. Long-term exposure to poor air quality, as well as to radon, chemicals, and arsenic compounds, among residents living near petrochemical manufacturing complexes has raised general awareness of, and the need to understand, the possible adverse health effects among nearby residents [4, 14].

Several epidemiological studies have explored associations between PICs and lung cancer risks among nearby residents. Given high public concern about health, the US began several investigations of suspected cancer risks for people living near PICs as early as the 1970s [15,16,17]. For example, US white males living in petroleum industry counties had 1.10- to 1.17-fold higher risks of lung cancer mortality than males in other counties [17]. Subsequent studies in Italy and the UK also revealed similar results, with relative risks of 1.26 and 1.04, respectively, among white females [7, 8]. Fast-growing economies in Asia, stimulated by the increasing demand for petrochemicals in manufacturing sectors, also faced corresponding increases in lung cancer mortality among residents near PICs [18]. However, several studies reported different results. For instance, Tsai and colleagues reported that male residents living in Louisiana’s Industrial Corridor had lower risks of lung cancer than other Louisiana citizens, even after adjusting for age [19]. Similarly, Simonsen and colleagues reported that the risk of lung cancer was not significantly elevated with residential proximity to the industrial area [20].

Given these inconsistent results, our study aimed to estimate the lung cancer mortality risk associated with PICs by combining cross-country data from different studies via a systematic review and meta-analysis.

Methods

Data source and study selection

We selected articles exclusively from PubMed, Cochrane Library, Web of Science, Science Direct, and other sources that were published before July 11, 2017. We used "(Lung cancer OR Lung neoplasm) AND (Refinery OR Petroleum OR Petrochemical OR Oil and Gas Industry)" as the search term. Two researchers (HY Hung and RT Lin) independently selected articles that met the inclusion and exclusion criteria below.

Inclusion and exclusion criteria

The inclusion criteria were: (1) original articles that clearly defined the exposure group as residents living near PICs; (2) original articles that clearly defined lung cancer mortality according to the International Classification of Diseases (ICD); (3) original articles that reported confidence intervals (CI), standard errors (SE), or both; and (4) original articles that were written in English and whose full texts were available. The exclusion criteria were: (1) studies with subjects that overlapped with another publication; (2) studies that focused on occupational exposure in petrochemical plants only; and (3) studies that reported lung cancer incidence only and lacked mortality data.

Review process and data extraction

Figure 1 shows the selection process of the articles, which comprised four steps: identification, screening, eligibility, and inclusion. First, we identified 1249 articles from library databases and excluded 131 duplicated articles. Second, we screened articles by titles and abstracts and chose 30 of them as relevant to our study objective for full-text review. Third, we carefully reviewed and checked whether those articles clearly defined exposure and health outcome and also reported estimates and CI or SE. Considering that a study population might appear in different articles, we selected the latest article to avoid bias towards a specific population. Finally, we included seven articles that reported 13 estimates for meta-analysis: three articles reporting sex-specific mortality rate ratios of lung cancer [7, 18, 21]; one article reporting sex-specific age-adjusted mortality rates of the Industrial Corridor and Louisiana, respectively [19]; one article reporting odds ratios for both sexes combined [22]; one article reporting standardized mortality ratios by sex [8]; and another reporting standardized mortality ratios for both sexes combined [23]. The ratio from Belli’s study was treated as an estimate for males in the subgroup analysis because males accounted for 84% of the study group [22].

Fig. 1 Flow of systematic literature search on lung cancer mortality for residents living nearby petrochemical sites. N = number of studies; n = number of estimates included into meta-analysis; RR = relative risk (rate ratio or risk ratio); OR = odds ratio; SMR = standardized mortality ratio

Since lung cancer mortalities were less than 10⁻³ per year [24], we could appropriately interpret estimated odds ratios as relative risks [25, 26]. The adjusted standardized mortality ratios could be interpreted as relative risks as well because the estimates were derived from the comparison to general population in Rome [8, 27].
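The rare-outcome argument above can be checked numerically. The sketch below uses the standard conversion RR = OR / (1 − p0 + p0 × OR), where p0 is the baseline risk in the reference group. The function name `or_to_rr` is hypothetical, and pairing the odds ratio of 3.10 with a baseline risk of 10⁻³ is an illustrative assumption rather than a calculation reported in the source.

```python
def or_to_rr(odds_ratio, baseline_risk):
    """Convert an odds ratio to a relative risk for a given baseline risk.

    Uses RR = OR / (1 - p0 + p0 * OR), where p0 is the outcome risk in the
    reference (unexposed) group. When p0 is small, RR is close to OR.
    """
    if not 0.0 <= baseline_risk < 1.0:
        raise ValueError("baseline_risk must be in [0, 1)")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)


# Illustrative inputs (assumed pairing): OR = 3.10 with baseline risk 1e-3
rr_rare = or_to_rr(3.10, 1e-3)
print(f"OR = 3.10, p0 = 0.001 -> RR = {rr_rare:.4f}")

# Contrast with a common outcome, where OR overstates RR
rr_common = or_to_rr(3.10, 0.30)
print(f"OR = 3.10, p0 = 0.30  -> RR = {rr_common:.4f}")
```

With a baseline risk this small, the converted relative risk differs from the odds ratio only in the third significant figure, which is why the two measures can be pooled together here.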

For the study not reporting CI or SE [19], we estimated the variance and SE of lnRR using the following equations:

$$ Var(\ln RR)= Var\left(\ln R_1-\ln R_0\right) $$
$$ = Var\left(\ln R_1\right)+ Var\left(\ln R_0\right) $$
$$ \approx {\left(\frac{1}{R_1}\right)}^2\times Var\left(R_1\right)+{\left(\frac{1}{R_0}\right)}^2\times Var\left(R_0\right) $$
(1)
$$ SE(\ln RR)=\sqrt{Var(\ln RR)} $$
(2)

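where Var(lnRR) represents the variance of the natural log of the relative risk (RR); R1 and R0 represent the mortality rates of the studied group and the reference group, respectively; and SE(lnRR) represents the standard error of the natural log of the relative risk. Because lnRR = lnR1 − lnR0, the variance of lnRR is the sum of the two variances when the two rates are independent, and each term is approximated by the delta method.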
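Equations (1) and (2) can be sketched in a few lines. The function names and the input rates below are hypothetical and are not values from the cited study. The sketch assumes the two rates are independent and that a standard error is available for each rate.

```python
import math


def se_ln_rr(r1, se_r1, r0, se_r0):
    """Standard error of ln(RR) for RR = r1 / r0, following Eqs. (1)-(2).

    Delta method: Var(ln R) is approximated by (1/R)^2 * Var(R).
    Assumes the two rates are independent.
    """
    if r1 <= 0 or r0 <= 0:
        raise ValueError("rates must be positive")
    var_ln_rr = (se_r1 / r1) ** 2 + (se_r0 / r0) ** 2
    return math.sqrt(var_ln_rr)


def rr_with_ci(r1, se_r1, r0, se_r0, z=1.96):
    """Return (RR, lower, upper) with a Wald-type CI built on the log scale."""
    ln_rr = math.log(r1 / r0)
    se = se_ln_rr(r1, se_r1, r0, se_r0)
    return (math.exp(ln_rr),
            math.exp(ln_rr - z * se),
            math.exp(ln_rr + z * se))


# Hypothetical age-adjusted rates per 100,000 with their standard errors
rr, lower, upper = rr_with_ci(r1=80.0, se_r1=4.0, r0=85.0, se_r0=2.0)
print(f"RR = {rr:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

Building the interval on the log scale keeps the lower bound positive and makes the result directly usable as an input to inverse-variance pooling.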

Statistical analysis

We applied a random-effects model, which accounts for both within- and between-study variation, and quantified heterogeneity using the I² statistic [28]. We regarded I² less than 10% as no heterogeneity, 10%–30% as low heterogeneity, 30%–60% as moderate heterogeneity, and more than 60% as high heterogeneity, based on the Cochrane handbook [29]. We further did subgroup analyses by different characteristics [30], including sex, location, ethnicity, PM10 standard, latency period (first year of study period more than 20 years after the operation year of PICs vs. less than or equal to 20 years), and bona fide observation (defined as 10 or more years of observation after 20 years of PIC operations vs. less than 10 years). Then, we applied meta-regressions to investigate possible sources of heterogeneity, including sex, ethnicity, location, year of publication, and the starting year of follow-up. We also conducted sensitivity analyses to assess the influence of individual studies on the overall RR by adding one estimate at a time to the pooled estimate. Finally, we used a funnel plot and Begg’s and Egger’s tests for asymmetry to examine whether there was publication bias or small-study bias. All analyses were performed using Stata version 11.2 (StataCorp, TX, US). We set the statistical significance level at 0.05, using a two-sided test.
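A minimal sketch of random-effects pooling and the I² statistic follows. It assumes the DerSimonian-Laird estimator of the between-study variance; the source does not name the estimator, so this is an illustration rather than the authors' exact procedure. The function name `pool_random_effects` and the input values are hypothetical and are not data from the included studies.

```python
import math


def pool_random_effects(ln_rr, se, z=1.96):
    """Pool log relative risks with a DerSimonian-Laird random-effects model.

    ln_rr: log relative risks; se: their standard errors.
    Returns the pooled RR, its CI, tau^2, I^2 (percent), and weights (percent).
    """
    if len(ln_rr) != len(se) or len(ln_rr) < 2:
        raise ValueError("need at least two estimates with matching SEs")
    # Fixed-effect (inverse-variance) weights and pooled mean
    w = [1.0 / s ** 2 for s in se]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, ln_rr)) / sw
    # Cochran's Q and its degrees of freedom
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ln_rr))
    df = len(ln_rr) - 1
    # Between-study variance tau^2, truncated at zero
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    # I^2: share of total variation attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    sw_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, ln_rr)) / sw_re
    se_pooled = math.sqrt(1.0 / sw_re)
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - z * se_pooled),
               math.exp(pooled + z * se_pooled)),
        "tau2": tau2,
        "i2": i2,
        "weights_pct": [100.0 * wi / sw_re for wi in w_re],
    }


# Hypothetical log relative risks and standard errors
result = pool_random_effects([0.10, -0.05, 0.22, 0.03], [0.05, 0.08, 0.20, 0.04])
print(f"pooled RR = {result['rr']:.3f}, I2 = {result['i2']:.1f}%")
```

The weights returned by the sketch show why an estimate with a very wide CI contributes little to the pooled result: its within-study variance dominates the denominator of its weight.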

Assessment of data quality

To assess the risk of bias, we recorded and assessed the quality of each study using the Newcastle-Ottawa Quality Assessment Scale [31]. Records on data quality for each study were reviewed by CK Lin and HY Hung. We evaluated potential bias based on three categories (selection, comparability, and outcome) with eight measurements [31]. Although the discussion on the validity of the Scale remains inconclusive, its reliability is fair and it is widely used in meta-analyses [32, 33].

Air quality standards

Pollutants emitted from PICs might vary over time, likely due to changes in manufacturing processes and pollution control technology. Since data on air quality around the studied petrochemical areas were limited, we reviewed national or regional ambient air quality standards in the studied countries or regions: the European Union (EU), Taiwan, and the US. We summarized three air quality standards for the studied countries, including total suspended particles (TSP), PM10, and PM2.5 [34,35,36,37,38,39,40,41].

Results

Table 1 shows the basic characteristics of the studies included in our meta-analysis. A total of 13 study groups were extracted, covering around 2,017,365 people living near petrochemical areas in Taiwan; Louisiana in the US; Teesside and West Glamorgan in the UK; and Brindisi, Sicily, and Rome in Italy. Seven out of 13 study groups reported RRs for males, five for females, and one for both sexes combined. The follow-up years ranged widely from 1960 to 2002. Most PICs had been in operation for at least 14 years.

Table 1 Basic characteristics of studies included in the meta-analysis

Figure 2 shows the pooled estimate of mortality risk for lung cancer among residents living near PICs. The estimated overall RR of 1.03 indicated that lung cancer mortality among residents might be associated with exposure to PICs, but it did not reach statistical significance (95% CI = 0.98–1.09). Although Belli’s study (study ID = H in Fig. 2) reported a point estimate of lung cancer risk as high as 3.10, its broad CI ranging from 0.82 to 11.79 led to the smallest weighting factor of 0.17% in our meta-analysis. Among the selected studies, the highest weighting factor of 23.35% (study ID = K in Fig. 2) indicated that Michelozzi’s study on males in Rome contributed the largest proportion of the pooled estimate, mainly because this study had the narrowest CI. The overall I² was 25.3%, indicating low heterogeneity among these studies.
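The link between CI width, standard error, and significance can be sketched as follows. The sketch assumes Wald-type intervals that are symmetric on the log scale and back-calculates the SE from the rounded bounds quoted above, so the results are approximate. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z_95):
    """Back-calculate SE of ln(RR) from a CI assumed symmetric on the log scale."""
    if lower <= 0 or upper <= lower:
        raise ValueError("need 0 < lower < upper")
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def two_sided_p(rr, lower, upper, z=Z_95):
    """Two-sided p-value for H0: RR = 1 using a normal approximation."""
    z_stat = math.log(rr) / se_from_ci(lower, upper, z)
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    return math.erfc(abs(z_stat) / math.sqrt(2.0))


# Values quoted in the text: Belli's study and the pooled estimate
se_belli = se_from_ci(0.82, 11.79)
se_pooled = se_from_ci(0.98, 1.09)
print(f"SE(lnRR), Belli:  {se_belli:.3f}")
print(f"SE(lnRR), pooled: {se_pooled:.3f}")
print(f"p-value, pooled RR 1.03: {two_sided_p(1.03, 0.98, 1.09):.3f}")
```

Under these assumptions the SE for Belli’s study is roughly 0.68 on the log scale, far larger than that of the pooled estimate, which is consistent with its very small weight. The p-value for the pooled RR stays above 0.05, matching a CI that includes 1.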

Fig. 2 Forest plot of studies on lung cancer risks of residents living nearby petrochemical industrial complexes. RR = relative risk

Table 2 shows the pooled estimates and 95% CIs by different characteristics, including sex, location, ethnicity, PM10 standard, latency period, and bona fide observation. For each characteristic, there was no significant difference in pooled estimates between subgroups, based on overlapping 95% CIs. However, the point estimate of lung cancer risk associated with residential exposure to PICs was higher in the era of the looser PM10 standard (RR = 1.12, 95% CI = 0.97–1.29 vs. RR = 1.01, 95% CI = 0.96–1.06).

Table 2 Pooled estimates of relative risks of lung cancer mortality for residents living nearby petrochemical industrial complexes, by different characteristics

Except for the starting year of follow-up, we did not find any significant source of heterogeneity in the meta-regression analysis. The slope of the meta-regression line suggested that for each one-year increment in the starting year of follow-up, the RR of lung cancer decreased by a factor of 0.874 (p-value = 0.034, Fig. 3).
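As a worked reading of the reported slope, assume that 0.874 is the exponentiated meta-regression coefficient per one-year increase in the starting year of follow-up. The symbols α, β, and t₀ (starting year) are introduced here only for illustration and do not appear in the source.

$$ \ln RR = \alpha + \beta\, t_0, \qquad e^{\beta} = 0.874 \;\Rightarrow\; \beta = \ln(0.874) \approx -0.135 $$
$$ \left(1 - e^{\beta}\right) \times 100\% = (1 - 0.874) \times 100\% = 12.6\% $$

Under this reading, each one-year delay in the start of follow-up corresponds to a 12.6% lower RR.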

Fig. 3 The relationship between natural log of relative risk of lung cancer mortality and starting year of follow-up. ln(RR) = natural log of relative risk

Figure 4 shows the sensitivity analysis for the effect of individual studies on the pooled results. We added each study to the sensitivity analysis one at a time, from studies published in the earlier period to studies published in the later period. None of them significantly affected the pooled results. There was no significant publication bias among the 13 study groups (Egger’s test: p-value = 0.059; Begg’s test: p-value = 0.051). The funnel plot also showed no asymmetry in the estimates for the 13 study groups (Fig. 5).
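The idea behind Egger’s test can be sketched as an ordinary least-squares regression of the standardized effect (lnRR / SE) on precision (1 / SE); an intercept far from zero suggests funnel-plot asymmetry. This is a simplified illustration, not the Stata routine used in the analysis. The function name and input values are hypothetical, and the sketch returns only the intercept and its standard error, leaving the p-value (a t-test with n − 2 degrees of freedom) to the reader.

```python
import math


def egger_intercept(ln_rr, se):
    """Egger's regression: OLS of (lnRR / SE) on (1 / SE).

    Returns (intercept, se_intercept). Requires at least three estimates
    whose standard errors are not all equal.
    """
    n = len(ln_rr)
    if n != len(se) or n < 3:
        raise ValueError("need at least three estimates with matching SEs")
    y = [e / s for e, s in zip(ln_rr, se)]  # standardized effects
    x = [1.0 / s for s in se]               # precisions
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept


# Hypothetical estimates in which smaller studies show larger effects
ln_rr = [0.02, 0.05, 0.10, 0.30, 0.45]
se = [0.03, 0.05, 0.10, 0.25, 0.40]
b0, se_b0 = egger_intercept(ln_rr, se)
print(f"Egger intercept = {b0:.3f} (SE {se_b0:.3f})")
```

With only 13 estimates, a test of this kind has limited power, so p-values close to 0.05 are best read alongside the funnel plot rather than on their own.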

Fig. 4 Sensitivity analysis of random effects estimates after adding each additional study according to the publication year. RR = relative risk

Fig. 5 Funnel plot for lung cancer mortality relative rates associated with residential exposure to petrochemical industrial complexes of the 13 study groups. ln(RR) = Natural log of relative risks; SE of ln(RR) = standard error of natural log of relative risks

Additional file 1 lists details of the quality assessment for cohort and case-control studies, respectively. All studies reported sex-specific, age-adjusted point estimates. Some studies further adjusted for ethnicity, socioeconomic levels (e.g., school levels, job collar categories, unemployment, number of family members, overcrowding, and ownership of dwellings), or study periods. Four studies had the full score of nine stars [18, 21,22,23]; two had eight out of nine stars [7, 8]; and one study had seven out of nine stars [19] (see Additional file 2).

Air quality standards in the EU, Taiwan, and the US are summarized in Fig. 6. The earliest standard of ambient air quality was for TSP, followed by PM10 and PM2.5. All countries have set stricter air quality standards over the years. For example, the standard for annual average TSP concentration was 150 μg/m³ in the EU in 1983. The EU tightened the regulation by setting the annual PM10 standard at 60 μg/m³ in 1996, and then lowering it to 40 μg/m³ in 1999. In 2008, the EU set the annual PM2.5 standard at 25 μg/m³. Similarly, the US set the annual TSP standard at 75 μg/m³ in 1971, and further tightened the limits to 50 μg/m³ in 1987. In contrast, Taiwan adopted the US’s 1971 standard for TSP and PM10 and announced the regulation in 1992, but the limits have not been changed since then.

Fig. 6 Historical air quality standards of studied regions. TSP = total suspended particles; 1’ = primary pollutant; 2’ = secondary pollutant

Discussion

To the best of our knowledge, this is the first meta-analysis to estimate the pooled RR of lung cancer mortality for residents living near PICs. We aggregated lung cancer risks for 13 study groups from seven published papers in the US, the UK, Italy, and Taiwan. Based on these studies, people living near PICs had higher lung cancer mortality risks than residents of non-PIC areas by a factor of 1.03, although this association did not reach statistical significance (95% CI = 0.98–1.09). Stratified analysis by different characteristics, such as sex and ethnicity, did not change the magnitude of this association. In contrast, the starting year of follow-up affected the association between lung cancer mortality and exposure to PICs by a factor of 0.874. That is, the estimated risk of lung cancer mortality was higher among subjects recruited in earlier periods, and the risk decreased by 12.6% for each year that the start of follow-up was delayed.

The scientific evidence of the study is sound from several perspectives. First, the outcome variable was based on pathological samples and/or the ICD-9, and individual data were obtained by linkage to governmental databases. Second, the large sample size (n = 2,017,365) and diverse populations (e.g., by sex, ethnicity, and location) made the pooled estimate more representative and enhanced its generalizability. Third, by applying the random-effects model, we were able to account for the heterogeneity between studies when reporting the pooled effects.

We found higher lung cancer mortality risks among residents near PICs by a factor of 1.03, although this adjusted RR did not reach statistical significance. We identified the following possible limitations of the included studies. First, the definition of exposure varied slightly between studies. Most studies defined exposure based on the geographical locations or distances of residences from PICs [7, 8, 19, 21,22,23], while one study compared the exposed group and reference group by matching job categories in PIC and non-PIC towns [18]. Misclassification of exposure and non-exposure might exist and bias the pooled estimates towards the null. Second, the operation of PICs started as early as 1960, and some PICs are still in operation. Exposure to pollutants emitted from PICs might be quantitatively and qualitatively different in each period. Third, although our subgroup analysis did not show different risks for residents in different latency periods, not everyone in the selected studies had a sufficient latency period or an adequate follow-up period. Estimates of the latency period for lung cancer diagnosis vary widely but usually range from years to decades [42, 43]. Inclusion of residents with insufficient latency might have biased the results toward the null in the original studies.

An effective air quality intervention involves a series of steps, including regulatory establishments, pollution reductions, and anticipated improvements in health [44]. Although data on ambient pollution monitoring around PICs in the early periods were very limited and hard to obtain, previous studies have documented that pollution reductions could be attributable to changing regulations [45, 46]. We could reasonably assume that most petrochemical factories followed the local regulations to some extent. Therefore, the historical air quality standards for TSP, PM10, and PM2.5 could reflect the relative trends of exposure to air pollutants emitted from PICs. Most air quality standards became stricter over the years [34,35,36,37, 41]. Such a trend partially explains our findings in the meta-regression; that is, studies on populations with earlier exposure to PICs were associated with a significantly higher risk of lung cancer mortality.

Some limitations need to be addressed when interpreting our results. First, not all potential confounders, such as smoking, radon exposure, meteorological factors, and socioeconomic status, were adjusted for in the seven articles. However, these unadjusted confounders posed an unknown or even lower risk of lung cancer to the exposure group compared to the reference group. For example, the smoking rate of the exposure group was lower than that of the reference group in Bhopal and colleagues’ study [7]. Similarly, people living in the Industrial Corridor had higher socioeconomic status (less unemployment, higher income, and higher educational attainment) than the Louisiana average [19]. Since a lower smoking rate and higher neighborhood socioeconomic status were associated with lower lung cancer incidence [47, 48], health benefits from the improvement of socioeconomic status along with industrial development were likely to outweigh the negative effects of exposure to the petrochemical industry. Data on radon exposure, one of the risk factors for lung cancer, were absent in all selected papers. However, there is no evidence of higher radon exposure in PIC areas than in non-PIC ones [49, 50]. Similarly, seasonal variations in wind direction might either increase or decrease the effect of PIC exposure on residents’ health. Since all studies had exposure and reference groups from both upwind and downwind locations, subject-selection bias and meteorological effects due to location variance were reduced. Second, the studies with available data for meta-analysis originated from the US, the UK, Italy, and Taiwan. The generalization of the impact of the petrochemical industry on lung cancer might be restricted to these countries. However, these four countries represented the majority of countries with the largest petrochemical industries in terms of ethylene production capacity [51], the major base of petrochemicals and a common index for estimating the production capacity of a petrochemical company. Third, each PIC might involve other manufacturing processes (such as steel, coking, and power plants), and the exposure level could also be affected by geographical factors across different countries. Limited by the lack of corresponding exposure data, our findings were not able to address the heterogeneity between PICs. Fourth, a certain portion of residents living near PICs might have occupational exposure as well. Some studies separated the environmental exposure from the occupational exposure (study ID = A, B, G, H) or at least considered job categories in the analysis (study ID = M) to reduce the influence of occupational exposure.

Conclusions

Our meta-analysis of current evidence suggests only a slightly higher risk of lung cancer mortality among residents living near PICs. Our analysis also underlines the role of stringent regulations in improving air quality and reducing residential exposure to air pollution, which can further contribute to lowering the risk of lung cancer.