Concentrations, health risks, and sources of hazardous air pollutants in Seoul-Incheon, a megacity area in Korea

We conducted ambient monitoring of various hazardous air pollutants (HAPs) for 2 years (2013-2015) in two adjacent Korean cities in a megacity area: Seoul and Incheon. Measured HAPs included volatile organic compounds (VOCs), polycyclic aromatic hydrocarbons (PAHs), and heavy metals (HMs). The objectives of this study were to evaluate the spatiotemporal variations of HAPs, to prioritize HAPs based on health risks, to identify sources using a receptor-based model, and to estimate source-specific risks. Overall, the HAP levels in Incheon were higher than those in Seoul. The concentrations of combustion-origin HAPs, such as PAHs and some HMs, were significantly higher during the heating period than during the non-heating period. However, most VOCs exhibited an opposite trend. Benzo[a]pyrene showed the highest cancer risk in both cities, followed by formaldehyde, arsenic, and benzene; trichloroethylene was the only species that exceeded the hazard quotient of 1. Cumulative cancer risks were 2.0 × 10-4 in Seoul and 2.7 × 10-4 in Incheon. Major sources and their contributions to each HAP concentration were estimated by positive matrix factorization modeling. Based on source-specific risk assessments, we suggest that both cities should give high priority to the control of traffic pollution and the supply of cleaner fuels in non-residential sectors. Reducing carbonyl concentrations in Seoul and industrial emissions in Incheon is also necessary. Establishing new ambient standards for benzo[a]pyrene and formaldehyde is worth considering as a long-term measure. This study provides scientific information on the occurrence, health risks, and sources of various HAPs in large urban areas.


Introduction
Air pollution is recognized as a public health threat and is an inevitable consequence of fossil fuel consumption for energy production and industrial activities Samet and Gruskin 2015). East Asia is seriously affected by air pollution, particularly ambient particulate matter (PM) Park et al. 2019;Parrish and Zhu 2009). The association of environmental factors with cancer in humans (IARC 2018;USEPA 2019;WHO 2000) has focused attention on the adverse effects of hazardous air pollutants (HAPs, or air toxics) in urban and industrial areas (Kim et al. 2020;Scheffe et al. 2016;Wu et al. 2011).
There is no clear definition of HAPs; some characteristics of HAPs are (i) HAPs include many genotoxic and carcinogenic chemicals (WHO 2000); (ii) most HAPs are nonthreshold pollutants that affect health from prolonged exposure to low concentrations (Patrick 1994); (iii) some HAPs exert non-carcinogenic respiratory effects, environmental impact, and bioaccumulation (USEPA 2019). Toxic chemicals that are classified as HAPs include volatile organic compounds (VOCs), polycyclic aromatic hydrocarbons (PAHs), heavy metals (HMs), phthalates, dioxins/furans, and organochlorine pesticides (IARC 2018;USEPA 2019).
The lists of toxic substances or the groups of chemicals designated as HAPs vary from country to country, depending on regulatory, social, and industrial conditions. The US Environmental Protection Agency (USEPA) identified 187 air toxics under the Clean Air Act and defined 33 HAPs as urban air toxics (USEPA 2017). The Canadian Environmental protection Act listed some of the 187 HAPs named by the USEPA as "toxic substances" (Bari and Kindzierski 2017). The Japan Ministry of Environment listed 248 toxic chemicals as specific airborne hazardous substances and selected 23 of them as priority pollutants (Baek et al. 2020). The Korean Clean Air Act defines HAPs as specific air toxics that adversely affect the public by prolonged exposure to low concentrations. The list of HAPs in Korea expanded from 16 in 1978 to 35 in 2007. Because of these low numbers, the Korean policies for managing HAPs have been criticized as being inferior to those in the US and Japan (Kim et al. 2020).
In general, air pollutant emissions in densely populated urban areas (often called megacities) are high due to the density of traffic and intensity of energy use (Baklanov et al. 2016;Gulia et al. 2015). The environmental significance of urban air toxics is justified not only by the size of the exposed population but also by the diversity of emission sources relative to non-urban areas. Furthermore, continuous exposure to multiple toxics from stationary and mobile sources increases the public health risks. Risk assessment of urban air toxics is procedurally complex and requires a broad scope of data input. Consequently, monitoring HAPs in ambient air provides public exposure data that is crucial to the process.
Seoul is the largest and most densely populated city in South Korea, where approximately 10 million people inhabit only 0.6% of the total area (100,210 km 2 ). Incheon is a port city west of Seoul and is the third-largest city in Korea, with a population of 3 million. Although Seoul and Incheon are separate administrative districts, they are adjacent to each other, forming a daily life zone. The two cities are referred to as the Seoul-Incheon Megacity Area (SIMA), where 25% of the Korean population lives. In Seoul, the number of industrial areas is small because, for the past 40 years, industries that produce pollution have been forced to move to the Sihwa-Banwol national industrial complexes (SBNIC),~30 km southwest of Seoul. As of 2015, the SBNIC comprised more than 10,000 businesses, and is one of Korea's largest industrial areas . In contrast to Seoul, many sources of air pollution exist within the territory of Incheon, such as ports, airports, industrial complexes, and power plants. In addition, residential areas are close to these industrial complexes and are downwind.
One objective of air quality management is to reduce health risks caused by air pollution (Gulia et al. 2015). Although effectively managing all toxic pollutants is virtually impossible, identifying and prioritizing the management of pollutants that pose the most significant risks to residents, particularly in megacities, is an effective approach to reduce health risks. Numerous studies on air quality have been conducted in Seoul during the last three decades. However, most of these studies focused on criteria pollutants (e.g., SO 2 , NO x , O 3 , and PM), and few studies addressed issues related to HAPs . Although several studies on VOCs (Na and Kim 2001;Na et al. 2002) and PAHs (Bae et al. 2002;Kim et al. 2013;Lee et al. 2011) have been conducted in Seoul, measurements were taken sporadically and were limited to a small number of toxic pollutants. In addition, the air quality in Incheon has been ignored. Therefore, comprehensive monitoring of a wide range of HAPs has been proposed as an important and urgent task to determine ambient concentrations and recommend measures for protecting public health Parrish and Zhu 2009).
In this study, we carried out an intensive monitoring project from 2013 to 2015 at seven sites in Seoul and Incheon; these sites represent typical non-industrial areas of each city. More than 100 toxic chemicals were measured at each site, including VOCs, PAHs, and HMs. The data were then used to identify the sources and to prioritize the HAPs that pose the greatest health risks. The specific objectives of this study were (i) to investigate the occurrence, concentration, and spatiotemporal variation of HAPs in the ambient air of Seoul and Incheon; (ii) to prioritize HAPs based on health risk assessments; (iii) to identify major sources of HAPs using positive matrix factorization (PMF) modeling; and (iv) to apportion the health risks into contributing sources.

Materials and methods
Sampling periods, sites, and weather conditions Sampling was conducted in seven seasons over 2 years from 2013 to 2015. The monitoring protocols in the first and second years differed with respect to sampling location, period, and frequency. In the first year (from August 2013 to February 2014), samples were collected in three seasons (summer, autumn, and winter) at three sites in Seoul for ten consecutive days per season. In the second year (from July 2014 to March 2015), sampling was conducted at four sites in Incheon for seven consecutive days per season over four seasons. In Incheon, we added one additional sampling site in consideration of the city's topographical complexity and the diversity of emission sources. Since this study focused on the occurrence of HAPs in the ambient atmosphere exposed to the general public, the selection of sampling sites in each city was determined by considering the spatial distribution of nonindustrial areas. The location of each sampling site is shown in Fig. 1; details about the sampling periods in each year and the number of samples collected during the seasonal campaigns are presented in Table 1. Most air samples were collected on rooftops (10-15 m above the ground); one site was on a roadside. The details of the situation around each sampling site are described in Text S1 (Supplementary Materials). Hourly meteorological data during the sampling periods were obtained from the automatic weather stations in Seoul and Incheon, and the data are summarized in Table 1. Seasonal variations in the prevailing winds showed distinct patterns: SE-E winds in summer and NW-W winds in other seasons, indicating that this area is frequently affected by westerly winds blowing from the continent, particularly during cold seasons. Rainfall was observed in every sampling period, and a larger amount fell in summer than in other seasons, which is typical for South Korea.

Measurement of vapor phase HAPs
VOCs The VOC measurement method was based on the USEPA TO-17 protocol, which has been used in several field studies (Baek et al. 2015(Baek et al. , 2020aKim, et al. 2020;Seo et al. 2014). VOCs were collected twice per day (09:00−12:00 and 14:00−17:00) by drawing air through a stainless-steel tube (0.6 mm x 9 cm, Perkin Elmer, UK) packed with double sorbents: 120 mg and 280 mg of Carbograph-2TD and Carbograph-1TD (40/60 mesh, Markes, UK) in the front and back positions, respectively. The flow rate was set at 100 mL/ min per sample using a personal pump (FLEC Air Pump 1001, Chematec, Denmark). The sampling flow rate and duration were selected to avoid any breakthrough phenomena with a margin of safety. VOC analysis was performed by thermal desorption (Unity/Ultra, Markes, UK) and gas chromatography/mass spectrometry (GC/MS; HP6890/5973, Hewlett-Packard, USA). The operating conditions and specifications of thermal desorption and GC/MS analysis are described in detail in Text S2. A standard gas mixture of 62 VOCs for USEPA TO-15/17 (1 ppm, Supelco, USA) was used for identification and calibration purposes. In addition, two deuterated compounds (benzene-d 6 and ethylbenzened 10 , Rigas, Korea) were used as internal standards (IS). A self-manufactured gas spiking apparatus ( Fig. S2) was used to prepare standard tube samples and spike the IS to the field samples. Among the 62 VOCs, five highly volatile compounds (propylene,ethanol,Freon 21,chloromethane,and Freon 114) were excluded from the target analytes because they were not efficiently collected by the adsorbent tubes.
The degree of uncertainty for the VOC measurements was evaluated with respect to the analytical precision, method detection limit (MDL), and duplicate precision. Detailed data regarding quality control (QC) and quality assurance (QA) are presented in Table S1. Briefly, the analytical precision of target VOCs ranged from 3.0 to 18.4%. The MDLs for each VOC were evaluated according to the USEPA guidelines (USEPA, 2016a) and estimated to range from 0.02 (benzene) to 0.12 ppb (vinyl chloride). The mean duplicate precision (MDP) values for most VOCs were within the TO-17 recommended value of < 30%.
Carbonyls Carbonyls were sampled with the same frequency and duration as VOCs. Samples were collected on 2,4dintrophenylhydrazine (DNPH) cartridges (LpDNPH S10L, Supelco, USA) at a flow rate of 0.8 L/min for 3 h using personal pumps (SKC, USA). Ozone scrubbers were placed in front of the DNPH cartridges to avoid sampling artifacts. Carbonyls were extracted with 3 mL of acetonitrile, and then analyzed by high-performance liquid chromatography (LC-20A HPLC system, Shimadzu, Japan) with a standard mixture of 13 carbonyls (Carbonyl-DNPH Mix 1, Supelco, USA). Details of the carbonyl analytical method are presented in Text S3 and are reported in the literature (Baek et al. 2020a, b;Seo et al. 2014). The analytical precision for all carbonyls tested was < 2.5%. The carbonyl MDLs ranged from 0.02 to 0.03 ppb, and the MDP values were all < 30%.

Measurement of particulate HAPs
Collection of particle samples High-volume samplers (TE-PNY1123, Tisch Environmental, USA) were used to collect total suspended particles (TSP). The samplers were operated for 24 h at a flow rate of 550-600 L/min. Quartz fiber filters (8″ × 10″, QMA filter, Whatman, USA) were used for particle collection and were pretreated at~400°C for 4 h to remove any organic impurities. The pretreated filters were equilibrated for 24 h in a desiccator (20 ± 1°C and 45 ± 5%) and weighed before sampling. Each filter was divided into four pieces of equal size. Two pieces were used for PAH analysis, and the other two were used to determine the TSP and trace elements.
PAHs The protocol for PAH measurement was based on the USEPA TO-13A method. PAHs were extracted from the two filter pieces using a solvent extraction system (Soxtec 2055, Foss, Switzerland) with 80 mL of an acetone:hexane solution (1:9, v/v) for 40 min at 145°C, and then rinsed at a rate of 40-50 times/h for 80 min (Baek et al. 2020a). Before extraction, the filters were spiked with an aliquot of a mixture of five deuterated surrogate standards (SS). The extracted samples were initially concentrated to~4 mL using a vacuum evaporator (RapidVap, Labconco, USA) under N 2 gas, and then concentrated to 0.5 mL after being passed through an anhydrous Na 2 SO 4 cartridge. The concentrated samples were then spiked with a mixture of four internal standards (IS) and analyzed by GC/MS (6890N/5973i, Agilent Technologies, USA) in selected ion monitoring mode. The details of SS, IS, and GC/MS operating conditions are described in Text S4. The standard reference material (SRM) 2260a (Aromatic Hydrocarbons in Toluene, NIST, USA), a mixture of 35 PAHs, was used as the working standard for the qualitative and quantitative analysis of the target analytes. PAH concentrations were determined by the IS method, and each concentration value was corrected by the recovery rate of the corresponding SS.
The PAH measurements were also evaluated with respect to analytical precision, MDL, and recovery rate, according to the USEPA TO-13A protocol. The detailed QC/QA data for PAHs are presented in Table S2. The analytical precision for the 35 PAHs ranged from 0.5% (pyrene) to 25.2% (coronene). The correlation coefficients for the linearity of the calibration curves (four points) exceeded 0.99. The MDLs for the PAHs ranged from 0.02 ng/m 3 for benzo[a]pyrene (BaP) to 0.12 ng/ m 3 for dibenzo[a,e]pyrene (DaeP). The recovery rates and accuracies of the PAH data were evaluated using the SRM1649b (Urban Dust, NIST, USA). The recovery rates for the 23 PAHs having 3-7 rings ranged from 70.5 to 104.0% with a mean value of 82.8%. The accuracies were expressed as mean relative errors (MRE) and ranged from 4.0 to 29.6%. Although two-ring PAHs, such as naphthalene and dibenzothiophene, were measured from the particle samples, these data were discarded because of the low recovery rates (< 70%) and high MRE values (> 30%).
Trace elements Trace elements were extracted from the piece of each filter sample with a solution of HCl (16.75%, v/v) and HNO 3 (5.55%, v/v) using a microwave digester (Ethos, Milestone, Italy), according to the USEPA IO-3.1 method. The extracted samples were analyzed using inductively coupled plasma/atomic emission spectrometry (ICP/AES; Optima 3000RL, Perkin Elmer, USA). A total of 16 trace elements (Al, As, Ca, Cd, Co, Fe, K, Mg, Mn, Na, Ni, Pb, Se, Ti, V, and Zn) were determined using the ICP multielement standard solution IV (Merck, Germany) and individual standards. Hexavalent chromium (Cr(VI)) was measured only in Incheon during the second year by low-volume sampling according to the method established by the USEPA (2006). Details of the sampling and analytical procedures for Cr(VI) are presented in Text S4. The QC/QA data for the ICP method are presented in Table S3. The analytical precision for the trace elements ranged from 2.3% (Pb) to 25.4% (Se). The MDLs ranged from 0.05 ng/m 3 (Cd) to 10.93 ng/m 3 (Na). The recovery rates of trace elements were examined using SRM1648 (Urban Particulate Matter, NIST, USA). The soilassociated elements, such as Al, Ca, Fe, K, Mg, Na, and Ti, had recovery rates that ranged from 45.3% to 86.1%. However, heavy metals showed good recovery rates, ranging from 80.3% (V) to 108.5% (Pb); one exception was Se (25.9%). The Se data were not included in the risk assessment owing to poor recovery and low detection frequency (DF), both of which were less than 30%.
Organic carbon (OC) and elemental carbon (EC) Atmospheric particulate carbons (OC and EC) were determined from PM 2.5 samples, which were collected by low-volume sampling at a flow rate of 16.7 L/min for 24 h. A quartz fiber filter (4.5 cm dia.) was attached to a filter pack with a cyclone inlet for PM 2.5 (URG, USA). The OC and EC were analyzed by the thermal optical transmittance method (OC-EC Aerosol Analyzer, Sunset Lab. Inc., USA), and measured only in Seoul during the first year of the sampling period.

Positive matrix factorization modeling
PMF is a receptor model that estimates the contribution of each source to the measured concentrations of chemical species based on source compositions, which are also called fingerprints. The goal of using a receptor model is to obtain the chemical mass balance between the measured species concentrations and source profiles. In the PMF model, individual species measured in each sample are expressed as the sum of the contributions from identified sources as follows: where x ij is the concentration of the jth species (among m species) measured in the ith sample (among n samples); p is the number of independent sources contributing to the samples; f jk is the concentration of the jth species in the kth source profile; g ki is the relative contribution of the kth source to the ith sample; e ij is the error term of the PMF model for the jth species measured on the ith sample. The source profiles and contributions are derived by the PMF model in a manner that minimizes the objective function Q, which is defined as where u ij is the uncertainty of the jth species concentration in the ith sample. We used the USEPA PMF 5.0 to identify potential sources of VOCs, PAHs, and HMs. The uncertainty was calculated according to the User Guide of the PMF 5.0 (USEPA 2014), which states the following: If the concentration is less than the MDL of a species, then the uncertainty is replaced by a fixed fraction of the MDL (5/6 × MDL). If the concentration is greater than the MDL, then the uncertainty is calculated as [(error fraction × concentration) 2 + (0.5 × MDL) 2 ] 1/2 . According to Wu et al. (2009), the mean sourcespecific concentrations (x * jk ) of individual HAPs can be calculated as follows: where x * jk is the mean concentration of the jth species from the kth source, f jk , which is also used in Eq. (1); g * k is the mean contribution from the kth source. In this study, the calculated x * jk values were used to estimate the health risk caused by an identified source.

Toxicity information and health risk assessment
In this study, potential health risks were evaluated only for chronic exposure to air toxics via inhalation. To assess these risks, we applied a point estimation approach for a screening-level health risk assessment that has been adopted in several studies Kindzierski 2017, 2018;Fox et al. 2004;Jia and Foran 2013;Xiong et al. 2020;Wu et al. 2009Wu et al. , 2011.
The toxicity values applied in this study are presented in Tables S4-S6. The cancer risk (CR) for a specific HAP was calculated by its mean concentration (x * ij ) and the corresponding inhalation unit risk (UR) factor, and the non-cancer risk was estimated by the hazard quotient (HQ) as follows: where CR j , HQ j , UR j , and RfC j are the mean cancer risk, mean non-cancer risk, unit risk, and reference concentration (RfC) for the jth species, respectively. The UR values were obtained from the USEPA (2018a, b, c), the California EPA (Cal EPA 2018), and the World Health Organization (WHO 2000). The RfC values were also obtained from the USEPA (USEPA 1997(USEPA , 2018b. When no RfC was available in the USEPA database, we used the reference exposure levels (REL) provided by the California EPA (Cal EPA 2018) or minimal risk levels (MRL) reported by the US Agency for Toxic Substances and Disease Registry (ATSDR 2016) as a substitute for the RfC. If two or more toxicity values differed for any species, the more conservative value was taken. HAPs for which no UR, RfC, REL, or MRL value was available were not included in the risk calculation. Risk apportionment of identified sources was determined by calculating source-specific cancer and non-cancer risks as follows: where CR k and HQ k are the cancer and non-cancer risks, respectively, posed by the kth source, and x * jk is defined in Eq. (3). For PMF modeling and risk assessment, the unit of measurement for VOCs was converted from ppb to μg/m 3 (at 20°C, 1 atm).

Spatial distributions of HAP concentrations
The measured concentrations of selected HAPs (DF > 30%) in Seoul and Incheon are summarized in Table 2, and detailed data for each site in the two cities are presented in Tables S7 (Seoul) and S8 (Incheon). In this study, the selection criteria were based on three factors: abundance (concentration), ubiquity (DF), and toxicity (carcinogenic or non-carcinogenic impact). Benzene, formaldehyde (FALD), and BaP, which are well-known carcinogens (USEPA 2018a; IARC 2018), were detected in all samples in both cities. The most abundant VOC and PAH compounds were toluene (citywide average: 4.79 ppb in Seoul and 4.88 ppb in Incheon) and fluoranthene (FLRTH; 1.59 ng/m 3 in Seoul and 2.59 ng/m 3 in Incheon), respectively. Within the categories of VOCs, PAHs, and HMs, the distribution of the HAPs detected was broad and variable; the variability in the HAP concentrations reflected the variability in the sources and atmospheric behavior of each species. However, the variability across the sampling sites in each city was relatively small (Tables S7 and S8). Analysis of variance (ANOVA) indicated that, among the 60 species, a statistically significant difference (p < 0.05) in mean concentrations between different sampling sites was observed for 10 and 17 species in Seoul and Incheon, respectively. Two reasons for the lower variability are that all sampling sites were located in non-industrial areas and that the sampling sites in each city were affected by the same airshed.
Overall, the mean concentrations of HAPs were higher in Incheon than in Seoul. Because HAPs were measured in Seoul and Incheon at different times, a direct comparison between the two cities may not be appropriate owing to the lack of concurrency. Therefore, we evaluated air quality by focusing on the levels of criteria pollutants (SO 2 , NO 2 , CO, and PM 10 ) and determining whether significant changes in air quality occurred from 2013 to 2015. The criteria pollutant data were available from Air Korea (2020). The monthly variations in the ambient levels of the four criteria pollutants are presented in Fig. S2, and t test revealed that there were no significant changes (p > 0.05) between the first and second year during the sampling periods of this study. The only exceptions were the PM 10 levels in February and March, 2015, when Asian dust events occurred. Therefore, we compared the HAP levels between the two cities by assuming that the overall air quality in each city from 2013 to 2015 was equal. A comparison of the cumulative probability distributions of the data for the major HAPs between the two cities is shown in Fig. 2. Among VOCs, the concentration of trichloroethylene (TCE) was higher in Incheon than Seoul (p < 0.05), but that of methyl tert-butyl ether (MTBE) was higher in Seoul. Levels of other VOCs, such as benzene, toluene, xylenes, and FALD, were similar between the two cities (p > 0.05). TCE is a chlorinated organic solvent widely used for the vapor degreasing of metal parts (USEPA 2016b). MTBE is a good indicator of vehicle exhaust because it is used in gasoline as an anti-knocking agent (Hsieh et al. 2006). The levels of PAHs and HMs were higher in Incheon than in Seoul. The results suggest that the Incheon area is more affected by industrial activities than Seoul, despite its smaller population. The concentration ratios of some specific PAHs have been widely used to characterize and identify potential emission sources of PAHs (Ravindra et al. 2008;Tobiszewski and Namieśnik 2012;Yunker et al. 2002). Typical PAHs used in the diagnostic ratios to differentiate pyrogenic combustion from petrogenic sources are FLRTH and pyrene, benzo[A]anthracene (BaA) and chrysene, and indeo[1,2,3-c,d]pyrene (I123P) and benzo[g,h,i]perylene (BghiP). Spatiotemporal distributions of the diagnostic ratios for FLRTH/(FLRTH+pyrene), BaA/ (BaA+chrysene), and I123P/(I123P+BghiP) from this study are summarized in Table S9, while cross plots of FLRTH/(FLRTH+ pyrene) and I123P/(I123P+BghiP) are presented in Fig. S3. According to the diagnostic ratios data, it can be seen that petroleum fuel combustion, traffic-related emission, and biomass/coal combustion are the main affecting factors contributing to the atmospheric PAH concentrations in the Soul-Incheon area. However, the diagnostic ratio method is still very uncertain and has many difficulties in distinguishing similar sources (Ravindra et al. 2008;Tobiszewski and Namieśnik 2012). Therefore, it seems to be the best to use the method as a screening tool for approximate identification of PAH sources.

Relationship between HAP concentrations and ambient temperature
Some HAPs are correlated with industrial activities, and others are correlated with fossil fuel combustion Kim et al. 2020). Accordingly, the seasonal variability of HAPs may be different for individual species. In general, industrial activities are expected to be constant throughout the year, but the use of fossil fuels will vary depending on the ambient temperature. Therefore, instead of four seasons, we divided the data into two groups for each city: heating (from November through March) and non-heating periods (from April through October), based on the annual average temperature of 14.5°C in the SIMA. The average concentrations of selected HAPs measured during the heating and non-heating periods in Seoul and Incheon are shown in Fig. 3. Three distinct patterns were observed: (i) VOCs (except benzene and carbonyls) exhibited no statistically significant differences between the two periods; (ii) combustion-related HAPs, such as benzene, PAHs, As, and Cd, exhibited higher levels during the heating period than during the non-heating period; and (iii) carbonyl concentrations increased during the non-heating period; this increase was partially due to photochemical reactions in the atmosphere that occur during summer (Kim et al. 2020;Scheffe et al. 2016).
Seasonal variations in PAH concentrations occur in temperate zones, depending on the magnitude of fossil fuel combustion (Dat and Chang 2017;Li et al. 2009; Thang et al. . PAH levels in the SIMA were typically 3-10 times higher during cold seasons than during warm seasons. The increased levels of PAHs in winter occurred for several reasons: increased fossil fuel consumption, enhanced vapor-toparticle transformation, and cold-start emissions from automobiles (Cheruiyot et al. 2015;Dat and Chang 2017;Ravindra et al. 2008). In addition, transboundary pollution from China and North Korea, can contribute to PAH levels (Kim et al. 2013;Lee et al. 2006;Thang et al. 2019). Coal is widely used for space-heating in the northern parts of China and North Korea from October to March (Kim et al. 2013;Lee et al. 2006Lee et al. , 2011, which increases PAH emissions. Therefore, the PAH levels in the SIMA are affected not only by local sources but also by PAH emissions in China and North Korea from prevailing winds. Similar trends have been reported in Seoul (Kim et al. 2013;Lee et al. 2011) and on the west coast of Korea (Baek et al. 2020a;Lee et al. 2006;Thang et al. 2019). To validate our hypothesis about transboundary transport of PAHs during cold seasons, we conducted hybrid single-particle Lagrangian integrated trajectory (HYSPLIT) modeling (HYSPLIT, 2019) for February 5 and 7 as days when the maximum (35.44 ng/m 3 ) and minimum (5.54 ng/ m 3 ) concentrations of Σ35PAHs, respectively, were measured during the winter campaign in Seoul (Fig. S4). The back trajectories of air parcels that arrived in Seoul on February 5 were through China and North Korea, whereas those on February 7 were through the East Sea from Siberia. Unexpectedly high levels of Mn were observed during the heating period in Incheon (Fig. 3). Asian dust was observed in the SIMA for five days peaking on March 21, 2015. In March 18, just before the Asian dust invasion, the average level of TSP in Incheon was 102.4 μg/m 3 , but it increased 3.7 times (reaching 375.2 μg/m 3 ) and then returned to 106.5 μg/m 3 on March 24. During the same period, Mn levels increased from 54.1 to 233.0 ng/m 3 and then decreased to 63.6 ng/m 3 after the Asian dust event. The iron-steel industry is a major source of Mn emissions (Baek et al. 2020a), but Mn is also in the soil (Chow 1995), which is the primary component of Asian dust. Therefore, the levels of Mn during the heating period in Incheon are attributed to Asian dust. This conclusion was also validated by HYSPLIT modeling for March 18 and 21 (Fig. S5). The back trajectories show that air parcels arrived at Incheon through the East Sea on March 18, whereas they reached Incheon through the Yellow Sea from China on March 21. During the same period, there were no significant changes in the levels of PAHs and VOCs.

Assessment of health risks from HAP exposures
The data measured at each site were pooled for each city to assess the health impacts of HAPs on the residents of Seoul and Incheon. The cancer risk for an individual species was calculated using the citywide average and its corresponding UR. The UR is defined as the excess lifetime risk of a chronic exposure to 1 μg/m 3 of a pollutant (WHO 2000). The USEPA suggested 1 × 10 -6 and 1 × 10 -4 as acceptable and tolerable  Table 3, in descending order of risk. UR data were available for 49 species (see Tables S4 and S6 for details), and descriptive statistics are given for the top 25 HAPs after excluding rarely detected (DF < 30%) or lowrisk species. For the risk calculation, concentrations of coeluted PAHs were divided equally. The cancer risks associated with individual HAPs were generally higher in Incheon than in Seoul. BaP had the highest cancer risk in both cities, followed by FALD, As, and benzene. The number of HAPs whose cancer risk exceeded 1 × 10 -6 (acceptable) in Seoul and Incheon were 22 and 24, respectively. The cumulative cancer risks (ΣCR) were 2.0 × 10 -4 (Seoul) and 2.7 × 10 -4 (Incheon). In Seoul, VOCs, PAHs, and HMs contributed 43.8%, 42.0%, and 14.2%, respectively to the ΣCR, and in Incheon, the contributions were 37.1%, 46.0%, and 16.9%. The difference in the contributions of HAP subgroups to the ΣCR between Seoul and Incheon indicated that the emission sources of HAPs in the SIMA were predominantly associated with fossil fuel combustion and industrial activities.
Any direct comparison of ΣCR between this and other studies needs to be done cautiously because estimated risks depend on the number of HAP species included in the risk assessment. In addition, the cancer risk of PAHs can vary greatly depending on which UR value is used. In this study, the WHO value for BaP (8.7 × 10 -2 ) was applied, but it is almost 100 times greater than the USEPA UR value (6.0 x 10 -4 ). Recently, Baek et al. (2020a, b) reported cancer risk assessment for a wide range of HAPs in major industrial cities in Korea, such as Pohang and Ulsan. Their results are comparable to those reported in this study in terms of the number of target HAPs, sampling sites, and collected samples. The ΣCR levels in Seoul and Incheon were approximately equal to that of the residential areas (2.3 × 10 -4 for 28 HAPs) of Pohang, a steel industrial city. However, the ΣCR levels were higher than that of the residential areas (1.7 × 10 -4 for 35 HAPs) in Ulsan, a heavily industrialized city with petrochemical, shipbuilding, and automotive industries. It is somewhat surprising that HAP risk in the SIMA was greater than or equal to those in residential areas of industrial cities that have been cited for severe air pollution since the 1990s.
The non-cancer risk for a specific HAP was expressed as an HQ, according to Eq. (5). The HQ is the ratio of the potential exposure level to a substance and the level at which no adverse effects are expected (USEPA 2020). Non-carcinogenic reference values (RfC, REL, or MRL) were available for 63 species (Tables S4 and S6). Only 35 species (DF ≥ 30%) were included in the non-cancer risk calculation, and the HQs for the top 25 species are presented in Table S10. TCE showed the highest HQ for both cities. However, the overall non-cancer risk was less than the overall cancer risk because the only species with an HQ > 1 was TCE in Incheon. The cumulative HQ (ΣHQ) values for the top 25 HAPs were 2.84 and 4.45 in Seoul and Incheon, respectively. The ΣHQ values in this study were comparable to those in the residential areas of industrial cities, such as Pohang (3.4) and Ulsan (2.4), but much lower than the industrial area (10.0) of Ulsan (Baek et al. 2020a, b). VOC, HM, and PAH contributions to the ΣHQ were 59.1%, 30.7%, and 10.2%, respectively, in Seoul and 58.0%, 32.8%, and 9.2% in Incheon. However, the question of whether all HAPs impact the same organ or organ system remains unanswered. Little is known about the interactions among pollutants and the health risks posed by longterm exposure to HAPs. Nevertheless, the high ΣHQ values imply that the SIMA residents have experienced concurrent HAP exposures that may pose non-cancer risks for various endpoints.
There are uncertainties associated with the deterministic screening approach used in this study. For HAPs with known toxicity, most of the uncertainty is related to the assumptions applied to the quantitative derivations of toxicity values, such as UR, RfC, REL, and MRL (Bari and Kindzierski 2018). This study may underestimate or overestimate the health risks. Some hazardous species were not measured in this study (e.g., highly volatile hydrocarbons and vapor-phase PAHs), and some measured species were excluded from the risk assessment due to the lack of toxicity data. Overestimation could occur because (i) more conservative values were applied to some calculations, (ii) mean values from one year measurements were used to calculate long-term (70-year lifespan) toxicity values, and (iii) 50% of the MDL value was substituted when measured values were less than the MDL. Many of these limitations have also been addressed in previous studies Kindzierski 2017, 2018;Jia and Foran 2013;Ramírez et al. 2012;Wu et al. 2009Wu et al. , 2011. In addition, there was a possibility of overestimation of health risks because we determined PAHs and HMs from TSP samples, which would be theoretically higher levels than those from inhalable PM samples (such as PM 10 or PM 2.5 ) (Cheruiyot et al. 2015;Chow 1995).

Source apportionment of HAP concentrations
We applied PMF to the VOC, PAH, and HM datasets from Seoul and Incheon to provide quantitative information about the contributions of each source to the measured concentration of a specific HAP. In this study, the factors identified for each dataset were evaluated for interpretability, plausibility, and the diagnostic results of error estimation. Each task of the PMF was conducted by trial and error, and the final solution was optimized by considering Q roust /Q true , Q true /Q exp , the displacement (DISP) of factor elements, and bootstrap (BS) mapping, as suggested in the PMF 5.0 guidelines. BS-DISP error estimation was not performed because the runs were often aborted during computation for unknown reasons. Consequently, six and five factors were extracted from the VOC and PAH data, respectively, for each city. Six and seven factors were extracted from the HM data for Seoul and Incheon, respectively. A summary of the input data settings for the six PMF tasks and the diagnostics for error estimations is presented in Table S11. The number of variables included in PMF modeling was based on the signal to noise ratio (S/N); each species was categorized as "bad" (S/N < 2.0), "weak" (2.0 ≤ S/N < 3.0), and "strong" (S/N ≥ 3.0). In this study, each PMF solution was adopted from the base runs because there were no significant changes (dQ ≤ 0.5%) in the pattern of factors between the base runs and Fpeak runs. Source profiles extracted from VOCs, PAHs, and trace elements are shown in Figs. 4, 5, and 6, respectively. The source profile is the chemical composition of emission sources. It is expressed as either the absolute concentration of each species apportioned to the factor or the percentage of each species apportioned to the factor relative to the total concentration of the species. For simplicity, the source profiles were expressed as percentages in this study. Overall, the source profiles for Seoul and Incheon were very similar; however, each factor's relative contribution was not identical.

Fig. 4 Source profiles of VOCs in Seoul and Incheon
Consequently, each city's source profile for one group of HAPs is presented in each of the three figures. Additionally, the time series of source contributions for each HAP group and city estimated by PMF modeling are presented in Figs. S6-S11. Factors were initially characterized to obtain an easily interpretable factor, regardless of the order of their appearance in the PMF results.
VOC source profiles The source profiles of VOC data consisted of 20 variables (Fig. 4). The first factor was named as "vehicle exhaust" because MTBE exhibited the highest contribution (68.4% and 59.9% in Seoul and Incheon, respectively, hereinafter). MTBE is blended into gasoline as an anti-knocking agent (Vainiotalo et al. 1998), and is a good indicator of vehicle exhaust in urban areas (Hsieh et al. 2006). A high contribution (43.5% in Seoul) of 1,2,4-trimethylbenzene (124TMB) supported the relationship between this factor and vehicle exhaust (Bari and Kindzierski 2018;Na et al. 2002). However, the first factor seems to be partially associated with industrial sources because ethyl acetate (EACT) in Seoul and vinyl acetate (VACT) in Incheon had the second-highest   contributions to this factor. The dominant species of the second factor were ethylbenzene (EBZN) and xylenes (52%, 57%). These two aromatics are the components of paint thinners and varnishes used in architectural coatings and the printing, rubber, and leather industries (Bari and Kindzierski 2018;Kim et al. 2020;Xiong et al. 2020); therefore, this source was called "petroleum solvent" to distinguish it from chlorinebased and carbonyl solvents. The third factor had high contributions of benzene (43.5%, 71.7%), hexane (46.5%, 35.4%), and cyclohexane (38.2%, 27.4%). We named this factor "fuel combustion" because these VOCs are associated with combustion sources (Bari and Kindzierski 2018;Xiong et al. 2020). This factor may be related to the evaporative emission of petroleum under high-temperature conditions (Xiang et al. 2012). However, the time series of this factor (Figs. S6 and S7) showed that the contributions were higher in cold seasons than warm seasons, indicating a correlation with residential heating. The fourth factor was mainly associated with carbonyl compounds (43.1%-76.7%), which have atmospheric behavior and emission sources that differ from other VOCs. The fifth and sixth factors had contributions from organic solvents widely used in industry, but the data indicated two independent factors. The fifth factor was associated with isopropyl alcohol (IPA), EACT, VACT, and toluene, whereas the sixth factor was associated with TCE, methyl isobutyl ketone (MIBK), toluene, EBZN, and xylenes. Significant amounts of these substances were discharged from Incheon and SBNIC as fugitive emissions (Table S12), according to the Korean pollutant release and transfer registry (PRTR) data (KNIC 2015), the atmospheric levels of these industrial solvents in Seoul seem to be affected by the transport or dispersion of emissions in Incheon and the SBNIC. Therefore, the two factors were named "industrial emission (I) and (II)." PAH source profiles As shown in Fig. 5, the source profiles of PAH data consisted of 21 variables encompassing PAHs with 3-6 rings. The first factor had high contributions from coal combustion markers, such as FLRTH (73.8%, 43.4%) and pyrene (58.8%, 45.4%) (Khalili et al. 1995;Larsen and Baker 2003;Wang et al. 2014), and moderate contributions from chrysene, benzo(k)fluoranthene (BkF), I123P. Diagnostic ratios for FLRTH/(FLRTH+pyrene) and I123P/ (I123P+BghiP) of >0.5 have been used to identify coal combustion (De La Torr-Roche et al. 2009;Yunker et al. 2002): mean values of these two ratios for the first factor were 0.66 and 0.58, respectively. Therefore, we named this factor "coal combustion." According to the time series data (Figs. S8 and S9), the source contributions were higher in cold seasons than warm seasons, indicating that domestic coal-fired power plants and transboundary transport were affecting sources. Phenanthrene (PHEN) and 3-4 ring PAHs were distinctly abundant in the second factor. Low molecular weight (MW) PAHs are indicators of petrogenic sources, including crude oil and petroleum residues (Liu et al. 2018;Ravindra et al. 2008;Taghvaee et al. 2018;Wang et al. 2014;Wang et al. 2016). Low MW PAHs have also been attributed to oil combustion sources (Agudel-Castaneda and Teixeira 2014; Chen et al. 2016;Larsen and Baker 2003;Mishra et al. 2016) and to the evaporation of petroleum in warm seasons (Mishra et al. 2016;Wang et al. 2016). Petroleum combustion has often been identified based on the diagnostic ratios for FLRTH/(FLRTH+ pyrene) of 0.4-0.5 (De La Torr-Roche et al. 2009) and I123/(I123P+BghiP) of 0.2-0.5 (Yunker et al. 2002). A ratio of BaA/(BaA+chrysene) less than 0.2 was also suggested as an index for petrogenic sources of PAHs; while the ratio of 0.2-0.35 indicate coal combustion (Akyüz and Çabuk 2010).
In this study, those ratios were found to be 0.58, 0.48, and 0.09, respectively. Although these values do not fully meet the existing criteria, this factor was named "oil combustion" in consideration of the uncertainty of the diagnostic ratios (Tobiszewski and Namieśnik 2012). Temporal pattern of this factor was similar to that of coal combustion (Figs. S8 and S9), reflecting a relationship with fuel combustion during the heating period. The third factor was characterized as "diesel vehicle exhaust" because it was associated with 5-ring PAHs, encompassing BkF to BghiP. The source profiles of diesel and gasoline vehicles are often similar because they share some PAHs (Callén et al. 2014;Chen et al. 2016;Mishra et al. 2016;Ravindar et al. 2008). However, the relative abundance of benzofluoranthenes, I123P, and BghiP are greater in diesel emissions than gasoline exhaust (Agudel-Castaneda and Teixeira 2014; Larsen and Baker 2003;Liu et al. 2018;Khalili et al. 1995;Ravindra et al. 2008;Simcik et al. 1999;Taghvaee et al. 2018). The fourth factor overlapped the third factor in the four-factor solution, but it was an independent factor in the five-factor solution, implying that the two factors are derived from vehicle exhaust. Coronene exhibited the highest contribution (64.9%, 45.9%) in the fourth factor. The contributions of anthanthrene, BghiP, I123P, and dibenzo[a,h]anthracene+dibenzo[a,c]anthracene (DahA+ DacA) in the fourth factor were higher than those of benzofluoranthenes (tracers of diesel vehicles). Gasoline vehicles are sources of higher MW PAHs (Khalili et al. 1995;Miguel et al. 1998;Schauer et al. 2002). The time series of the third and fourth factors were alike, with no significant differences between the heating and non-heating periods (Figs. S8 and S9). Accordingly, this factor was named "gasoline exhaust," as in other studies (Chen et al. 2016;Khalili et al. 1995;Larsen and Baker 2003;Ravindra et al. 2008). Diagnostic ratios for BaP/(BaP+chrysene) of 0.5 and 0.7 have been suggested to identify diesel and gasoline emissions, respectively (Khalili et al. 1995;Ravindra et al. 2008 (Callén et al. 2014;Mishra et al. 2016;Simcik et al. 1999;Taghvaee et al. 2018); however, wood and biomass burning are also sources of BaA, chrysene, and BaP (Agudel-Castaneda and Teixeira 2014; Taghvaee et al. 2018;Wang et al. 2014). Therefore, we named this factor "natural gas/biomass burning." Town gas is produced from naphtha and natural gas and is a primary residential fuel in Seoul and Incheon. Wood/agricultural burning in the outskirts of the SIMA during cold seasons was identified by PMF modeling as a significant PM source in Seoul (Park et al. 2019). The contributions of this factor were significantly higher in the heating period compared to the non-heating period (Figs. S8 and S9). There are few diagnostic ratio data for natural gas, and the ratios for the combustion of biomass (wood and grass) have been reported in the same category as coal combustion in the literature (Tobiszewski and Namieśnik 2012). Therefore, it is difficult to evaluate the "natural gas/biomass burning" factor with respect to the PAH diagnostic ratios.
HM source profiles The source profiles of trace elements, including six toxic HMs, and OC/EC are shown in Fig. 6. In this study, OC/EC were analyzed from PM 2.5 samples which were collected simultaneously along with the TSP samples. The first factor was named "soil dust resuspension" because the soil components (e.g., Al, Ca, Fe, K, Mg, Mn, and Ti) had high contributions (Chow 1995;Tsai et al. 2020). The second factor was named "oil combustion" because of high contributions of V and Ni, markers of the combustion of diesel and residual oils (Belis et al. 2013;Chow 1995;Park et al. 2019). The third factor was characterized as "coal combustion," owing to the high contribution of As (Chow 1995;Tsai et al. 2010;Vejahati et al. 2010). The profiles of the fourth factor exhibited high contributions from Zn, Fe, Mn, OC, and EC, for which the most common source in urban areas is traffic pollution (Park et al. 2019). Zn is emitted from the lubricating oil of diesel vehicles (Belis et al. 2013;Zhou et al. 2009), and Fe is found in the catalyst of gasoline cars (Cesari et al. 2016;Zhao and Hopke 2004). Fe and Zn are also markers of brake and tire wear (Hjortenkrans et al. 2007;Napier et al. 2008). Therefore, we named this factor "traffic" to encompass vehicle exhaust and tire/brake wear. The fifth factor had the highest contributions of Cd (72.6%, 75.8%), followed by Pb (46.9%, 40.9%), which are markers for the "incineration" of municipal wastes (Belis et al. 2013;Chow 1995;Park et al. 2019). Na and Mg were the first-and the second-abundant elements in the sixth factor, named "marine aerosol." The seventh factor was derived only from Incheon data, and it was named "industry." It differed from the fourth factor, having high contributions of Co, Ni, and Pb. Industries in the SIMA are likely sources because Co is used in rechargeable batteries, super-alloys, paint, and petrochemical industrial processes. We calculated the contributions of each source to HAP concentrations after identifying the sources. Examples of major HAPs in Seoul and Incheon are presented in Table 4. The PMF model accounted for 98.1% and 99.2% of the total concentrations of the 20 VOCs used in the modeling of the Seoul and Incheon data, respectively. The PMF model explained 100.2% and 99.6% of the total concentrations of the 21 PAHs for the two cities, respectively, whereas it underestimated the seven toxic HMs (93.8% for both cities). Good correlations between estimated and measured values were obtained for most HAPs; R 2 values close to one, and the slopes and intercepts of regressions were close to one and zero, respectively. and Pb (0.5 μg/m 3 ). Therefore, it is necessary to introduce new standards for ambient levels of BaP and FALD because these two HAPs ranked high in terms of toxicity, abundance, and ubiquity. In Korea, the annual limit of ambient concentration of PM 2.5 was strengthened in 2018 (from 25 to 15 μg/m 3 ). However, it is further necessary to revise current emission standards from TSP to PM 2.5 because particulate HAPs, such as PAHs and toxic HMs, are predominantly associated with PM 2.5 . Finally, HAP control measures within South Korea alone may not be the sole solution since the contributions of transboundary transport of HAPs from China and North Korea tend to be significant during cold seasons. Therefore, international cooperation and joint efforts to improve air quality in Northeast Asia are considered to be very important tasks.