The impact of coal trains on PM2.5 in the San Francisco Bay area

Exposure to fine particulate matter (PM2.5) is associated with adverse health effects, including mortality, even at low concentrations. Rail conveyance of coal, accounting for one-third of American rail freight tonnage, is a source of PM2.5. However, there are limited studies of its contribution to PM2.5, especially in urban settings where residents experience higher exposure and vulnerability to air pollution. We developed a novel artificial intelligence-driven monitoring system to quantify average and maximum PM2.5 concentrations of full and empty (unloaded) coal trains compared to freight and passenger trains. The monitor was close to the train tracks in Richmond, California, a city with a racially diverse population of 115,000 and high rates of asthma and heart disease. We used multiple linear regression models controlling for diurnal patterns and meteorology. The results indicate coal trains add on average 8.32 µg/m3 (95% CI = 6.37, 10.28; p < 0.01) to ambient PM2.5, while sensitivity analysis produced midpoints ranging from 5 to 12 µg/m3. Coal trains contributed 2 to 3 µg/m3 more of PM2.5 than freight trains, and 7 µg/m3 more under calm wind conditions, suggesting our study underestimates emissions and subsequent concentrations of coal train dust. Empty coal cars tended to add 2 µg/m3. Regarding peak concentrations of PM2.5, our models suggest an increase of 17.4 µg/m3 (95% CI = 6.2, 28.5; p < 0.01) from coal trains, about 3 µg/m3 more than freight trains. Given rail shipment of coal occurs globally, including in populous areas, it is likely to have adverse effects on health and environmental justice.


Introduction
Myriad large epidemiological studies definitively establish that exposure to fine particulate matter (particles less than 2.5 microns in diameter, or PM2.5) is associated with a wide range of adverse health effects. Exposure to PM2.5 has been linked to premature mortality, cardiovascular, cerebrovascular, and respiratory diseases, other chronic diseases, adverse birth outcomes, and cognitive and developmental impairments (WHO 2021; U.S. EPA 2019). These effects occur even at concentrations lower than current regulatory standards (Brunekreef et al. 2021). While most studies of PM2.5 have examined daily or multi-year exposures, there is evidence of health effects from exposures as short as one hour (Liu et al. 2021; Peters et al. 2001; Wu et al. 2020). The World Health Organization recently lowered its air quality guidelines and indicated there is no known safe level of PM2.5 (U.S. EPA 2019; WHO 2021). The recent study of the Global Burden of Disease estimates that exposure to PM2.5 contributed to 6.7 million deaths per year worldwide, nearly 12% of the global total and the fourth highest risk factor for global mortality (Fuller et al. 2022). Of note, exposure to PM2.5 constitutes an environmental justice concern, as exposure and adverse effects are borne disproportionately by the most vulnerable, including infants, children, the elderly, people of color, those with low incomes, and those with underlying health conditions (Tessum et al. 2021).
Recent studies report that the combustion of fossil fuels, including coal, oil, and natural gas, is the largest source of ambient PM2.5-related mortality, with coal the largest source of this mortality (Vohra et al. 2021; McDuffie et al. 2015). Combustion, however, is not the only source of coal-related particulate matter, as fugitive dust from rail transport is known to be significant (BNSF Railway 2011). Trains transport nearly 70% of coal deliveries in the United States, with coal accounting for 1 of every 3 tons of American rail freight (US Energy Information Administration 2022). In a note to its customers, the BNSF Railway's own assessment stated: "The amount of coal dust that escapes from PRB [Powder River Basin in Wyoming and Montana] is surprisingly large" and reports have indicated that as much as 3% of the coal loaded into a coal car can be lost in transit (Baruya 2012; BNSF Railway 2011). Studies have confirmed that coal trains produce particulate matter not only through engine diesel emissions but also directly from the coal. These latter emissions occur via blow-off, suspension, and re-entrainment from wind erosion and wind scouring of loaded and unloaded coal cars, door leakage, and the "parasitic load," i.e., coal spilled and carried on external parts of the train (Prakash et al. 2018). The magnitude of ambient particulates from coal trains is influenced by train and wind speed, weather, moisture, rail car and load geometry, physical properties of the coal, vibration, and the use and efficacy of dust suppression methods (Prakash et al. 2018). Unfortunately, the actual contribution of coal trains to ambient PM2.5 is poorly documented.
Given the dearth of studies quantifying the effects of coal transport on subsequent concentrations of particulate matter and the significant health implications of exposure to particulate matter, additional study is warranted. Below, we report results from the novel monitoring system we developed and utilized to quantify the contribution to ambient PM 2.5 from uncovered railcars that convey coal predominantly from mines in Southern Utah to the Levin Terminal in Richmond, California.

Data collection
Particulate matter from coal is known to contain many impurities and elements including heavy metals known to be toxic or carcinogenic to humans (OEHHA 2015). Specifically, the coal of interest in this study originated from the Wasatch Plateau coal fields, a coal-bearing outcrop approximately 145 km long and 11 to 32 km wide (Hatch et al. 1979). Previous assessments have determined coal from the plateau to be high volatile bituminous (Hatch et al. 1979). The coal is primarily carbonaceous with various inclusions and impurities, including several mineral species along with elemental impurities of Cr, Ni, and Se. There are also trace elements including As, Ba, Cd, F, Mn, Sb, Sr, Th, U, and V (Hatch et al. 1979).
To determine the PM 2.5 concentration resulting from passing full and unloaded ("empty") coal, freight and passenger trains, passing trains were monitored from May 19, 2022 through October 31, 2022 at a populated residential site approximately 7 km north of the terminal. The site is near the culmination of an 800-mile journey, thereby capturing the realistic conditions of long-haul coal conveyance as compared to the conditions at departure where dust suppressants are freshly applied, and trains are optimally loaded. The monitoring site is approximately 21.5 meters east (generally downwind) of the rail line, with parkland to the east and the San Francisco Bay to the west (Fig. 1). The site was selected to avoid PM 2.5 from other important sources such as major roadways, industrial facilities, Richmond port operations and the Levin terminal itself. This location and our study methodology ensured that any observed changes in PM 2.5 as the trains passed were strictly due to the trains themselves.
The train monitoring system comprises three data collection systems: (1) a personal weather station, (2) an air quality sensor, and (3) a custom camera system. The personal weather station was selected for direct data output via serial communication (VantageVue, Davis Instruments, USA). It provides temperature, ambient pressure, relative humidity, precipitation, and other meteorological parameters. The meteorological data are collected every minute. In addition, hourly wind speed and direction were derived from the NOAA site in Richmond for comparison.
The air quality sensor is a custom package consisting of three optical PM sensors (PMS5003, Nanchang Panteng Technology Co., Ltd, China). These are equivalent to cell-reciprocal nephelometers and are commonly recognized as the sensors used in the widely distributed PurpleAir PA-II monitor (Ouimette et al. 2022). The sensor responds to optical scattering from a 657 nm laser. Its signal is therefore related to mass via the mass scattering coefficient, which is a function of the chemical, morphological, and optical properties of the observed particles. The accuracy of this determination is governed by the variability of particle characteristics in the temporal and spatial dimensions. The sensors' high temporal resolution of one second and their inter-instrument precision, as assessed by numerous field and laboratory studies, were the principal qualities that enabled the detection of rapid train events (Tsai et al. 2020; AQ-SPEC 2022). Three channels were included to strengthen data quality control and calculate variance for each observation. The raw data from all three sensors were collected every second.
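As an illustration of how three channels can support quality control, the following minimal sketch combines one second of readings into a mean and a relative standard deviation. It assumes the channels are simply averaged, as Appendix A describes for AvgPM; the function name and the choice of relative standard deviation as the spread metric are hypothetical and are not taken from the study's code.

```python
import statistics

def combine_channels(readings):
    """Return (mean, relative_sd) for one second of readings from the three sensors.

    relative_sd is the sample standard deviation divided by the mean.
    It is None when the mean is zero, where the ratio is undefined.
    Hypothetical helper: the study's actual QC code is not published here.
    """
    mean = statistics.fmean(readings)
    sd = statistics.stdev(readings)  # sample SD across the channels
    rel_sd = sd / mean if mean > 0 else None
    return mean, rel_sd

# One second of readings (µg/m3) from the three channels.
mean, rel_sd = combine_channels([10.0, 12.0, 14.0])
```

A large relative standard deviation for a given second would flag disagreement among the channels, which is one plausible way to screen observations before averaging.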
Data quality metrics of the PM2.5 data were evaluated for 1 s, 10 s, and 10 min, equivalent to instantaneous readings, train event averaging, and pre-event background conditions, respectively. Prior to evaluation, the data were cleaned to remove aberrant sensor readings. Specifically, values outside two standard deviations were omitted. In all cases, these values were excessively high readings from the low-cost sensor. The observations used in the subsequent statistical analysis ranged from 0 to 117.45 µg/m3 with a median uncertainty of 27%, well within the linearity range of the sensors of < 300 µg/m3 (Barkjohn et al. 2022).
The custom camera system consists of a microcomputer (Jetson Nano, Nvidia, USA), a camera (NoIR PiCamera, Raspberry Pi), an artificial intelligence (AI) accelerator (Coral Edge TPU, Google, USA), a solid-state hard drive (500 GB T5, Samsung, S. Korea), and an infrared floodlight (IR Illuminator 30 deg, Axton Technologies, USA). The system is placed approximately 60 m from the chosen source and operates autonomously on a continuous basis, except for a daily 30-min period when data are being uploaded to a cloud server (Lightsail, Amazon Web Services, USA).
The camera system is the pivotal technology that enables detection of passing trains. Images from the camera are passed to the computer at 30 frames per second, where they are pre-processed and passed to the AI accelerator. The accelerator is a Tensor Processing Unit (TPU, Coral Edge TPU, Alphabet, USA), which runs an image classification model customized for the monitoring location. This model identifies whether or not a train is present in the image. If so, the computer creates a train event and records: one second before the train was detected, the entire train event, and one second after the train is no longer detected. This recording is saved as an individual train event to an external hard drive. Train speed (meters per second) and the train direction towards or away from the terminal were determined during manual post-processing of the data. Determining object velocity from video recordings is error prone due to variable image processing rates. Instead, train speeds were estimated by using the average frame rate (frames per second) recorded during the monitoring period and fixed observation points in the camera's field of view. A schematic diagram of the system is presented in Fig. 2.
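The speed estimate described above (and detailed in Appendix A) can be sketched as follows. This is a simplified, single-band illustration under stated assumptions: the intensity series at the two trigger positions are already extracted, and the frame with the largest absolute first difference marks the train's arrival. The function and parameter names are hypothetical, and the study's algorithm compares derivative peaks in each of the R, G, and B bands rather than in one series.

```python
import numpy as np

def estimate_speed(left_signal, right_signal, fps, trigger_distance_m):
    """Estimate train speed (m/s) from pixel intensities at two trigger lines.

    left_signal, right_signal: per-frame intensity at each trigger position.
    fps: average frame rate of the recording (frames per second).
    trigger_distance_m: real-world distance between the two trigger lines.
    Returns (speed, direction), with direction 1 = moving right, 0 = moving left.
    """
    # The frame with the largest absolute intensity change marks the arrival.
    left_arrival = int(np.argmax(np.abs(np.diff(left_signal))))
    right_arrival = int(np.argmax(np.abs(np.diff(right_signal))))
    lag_frames = right_arrival - left_arrival
    if lag_frames == 0:
        raise ValueError("train reached both trigger lines in the same frame")
    lag_seconds = abs(lag_frames) / fps
    direction = 1 if lag_frames > 0 else 0  # left line triggered first: moving right
    return trigger_distance_m / lag_seconds, direction

# Synthetic example: the train reaches the left line 30 frames before the right line.
left = np.zeros(100)
left[30:] = 1.0
right = np.zeros(100)
right[60:] = 1.0
speed, direction = estimate_speed(left, right, fps=30.0, trigger_distance_m=15.0)
```

Using the average frame rate rather than per-frame timestamps mirrors the text's point that variable image processing rates make frame-by-frame timing unreliable.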
For each 24-h period, data were aggregated from all three data sources and standardized into one-second observations for each measurement parameter, including meteorology, PM2.5 concentrations and train detection. During this aggregation period of less than 30 min, the monitoring systems were disabled and the data file, along with the associated video files, was uploaded to the cloud server. The data aggregation and upload period was scheduled in the early morning hours, when train activity was determined to be consistently absent. Files located in the cloud were retrieved at the user's convenience for post-processing, which consisted of associating particulate matter and meteorological data with the observed train events based on the shared data timestamps. Accurate date and time determinations were ensured by a consistent internet connection and verification by the operating system. Further detail on the derivation of the variables used in our analysis is provided in Appendix A.

Data management
The PM 2.5 during train passage was recorded in one-second concentrations and averaged for the roughly 4 to 5 min of passage (longer for freight trains). In addition to the PM 2.5 average during the passage of the train, the maximum 10-s average concentrations during the train passage were also recorded and analyzed in order to compare with previous studies.
To determine the change in PM 2.5 due to passing trains, we quantified the difference between the measured PM 2.5 at the rail site and a "control" period of exposure. The control, also considered a "pre-exposure" period, corresponded to the period just prior to a train's passage, allowing capture of ambient PM 2.5 without the train's contribution as well as controlling for normal diurnal and regional changes in PM 2.5 concentrations. A generally similar approach was used by previous studies (Jaffe et al. 2015;Akaoka et al. 2017). We also established a gap between the control period and the train passage to ensure that particles influenced by the high-pressure zone in front of an oncoming train would not be included in the control.
To select the duration of the control exposure and this gap immediately before the train passage, we examined several alternatives including: a five-minute average ending with a 2-minute gap before the train (5/2) as well as 3/2, 5/5, 10/2, 10/5 and 10/10. Ultimately, the results were insensitive to the alternative control and gap periods, so only the results using 10/10 are reported below.
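The control-and-gap calculation can be made concrete with a minimal sketch. It assumes a plain list of one-second PM2.5 values and index positions for the train's start and end; the function names are hypothetical, and the defaults correspond to the 10/10 specification (a 600-s average ending 600 s before the train).

```python
from statistics import fmean

def pre_train_mean(pm, train_start, avg_s=600, gap_s=600):
    """Mean PM2.5 over avg_s seconds ending gap_s seconds before train_start.

    pm is a list of one-second PM2.5 values; train_start is an index into it.
    """
    end = train_start - gap_s
    start = end - avg_s
    if start < 0:
        raise ValueError("not enough data before the train for this control window")
    return fmean(pm[start:end])

def max_rolling_mean(values, window_s=10):
    """Largest mean over any window_s consecutive one-second values."""
    if len(values) < window_s:
        raise ValueError("event shorter than the averaging window")
    return max(fmean(values[i:i + window_s])
               for i in range(len(values) - window_s + 1))

def pm_deltas(pm, train_start, train_end, avg_s=600, gap_s=600):
    """Return (average delta, peak 10-s delta) relative to the control period."""
    control = pre_train_mean(pm, train_start, avg_s, gap_s)
    event = pm[train_start:train_end]
    return fmean(event) - control, max_rolling_mean(event) - control

# Synthetic series: 20 min of background, then a 210-s train with a 10-s spike.
pm = [5.0] * 1200 + [8.0] * 100 + [20.0] * 10 + [8.0] * 100
avg_delta, peak_delta = pm_deltas(pm, train_start=1200, train_end=1410)
```

Changing `avg_s` and `gap_s` reproduces the other specifications examined, such as 5/2 (300 and 120 s).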

Analysis
We addressed several issues including whether full or empty coal cars contribute to local ambient PM 2.5 concentrations and, if so, by how much. We also compared the impacts of coal cars relative to those of both freight and passenger trains. Using multiple linear regression with the change in PM 2.5 concentration as the dependent variable, our model included binary variables for each of the four train types (passenger, freight, empty and full coal) and examined and controlled for potential confounders. For example, a previous study found a strong association between PM 2.5 from coal trains and the effective wind speed (the sum of train and wind speed) (Jaffe et al. 2015). To test the sensitivity of our results to the model specification, we examined the impact of several covariates including train speed, wind speed, effective wind speed, duration of exposure (based on the elapsed time of the train passing), average temperature, dewpoint and relative humidity. The inclusion of humidity served to control for the potential impact of the hygroscopic property of fine particles when measured with optical sensors. We ran the model without a constant term, which facilitated the direct comparison of the impact among the train types. The model results were identical to a model that adds a constant term and drops one of the train types to avoid multi-collinearity.
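The equivalence between the no-constant model and a model with a constant and one train type dropped can be checked numerically. The sketch below uses synthetic data only (the effect sizes, sample size, and single wind covariate are illustrative assumptions, not study values) and ordinary least squares via `numpy.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
train_type = rng.integers(0, 4, size=n)        # 0..3: four hypothetical train types
wind = rng.uniform(0, 10, size=n)              # synthetic covariate
true_effect = np.array([0.0, 4.0, 2.0, 7.0])   # illustrative values only
y = true_effect[train_type] - 0.3 * wind + rng.normal(0, 1, size=n)

dummies = np.eye(4)[train_type]                # one indicator column per train type

# (a) No constant: each indicator coefficient is that type's adjusted change.
X_a = np.column_stack([dummies, wind])
beta_a, *_ = np.linalg.lstsq(X_a, y, rcond=None)

# (b) Constant plus three indicators (type 0 dropped as the reference).
X_b = np.column_stack([np.ones(n), dummies[:, 1:], wind])
beta_b, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# In (b) the intercept is the type 0 effect and the others are differences from it.
type_effects_b = np.concatenate([[beta_b[0]], beta_b[0] + beta_b[1:4]])
```

Both design matrices span the same column space, so the fitted values and covariate coefficient are identical; the no-constant form simply reports each train type's effect directly instead of as a contrast with a reference type.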
Additional sensitivity analysis included examining the impact of converting the negative values for the change in PM2.5 into zero values. The negative change from the control period could be a result of significant dust from activities at the monitor's residential location, dust from trains occurring in the control period, or a sudden change in wind speed or direction prior to the train arrival. We also considered subsets of certain covariates. For example, we examined those days where wind was below the mean level of 3.1 mph, since these calmer periods may relate to higher concentrations at the nearby monitor, whereas particles may disperse to a larger area under other wind conditions. Finally, we tested a model where the air quality sensor was calibrated and directly corrected for relative humidity using the closest Federal Equivalency Method (FEM) monitor to our site. This monitor was located in nearby San Pablo (Air Quality System Site ID: 06-013-1004), 1.6 km from our site, and generated the following fit, with an R2 of 0.58:
$$\text{PM2.5\_C} = 9.79 + 0.76 \times \text{PM\_PA} - 0.095 \times \text{humidity}$$
where PM2.5_C is the calibrated and corrected concentration of PM2.5, and PM_PA is the original reading at the train site. In addition to the average change in PM2.5 (difference of PM2.5 during train passage and the control), the maximum (10 s average) concentration relative to the control period was analyzed to compare with findings of previous studies.

Results
Ultimately, during the six-month observation period, the increases in ambient PM2.5 concentrations were measured during the passage of four different train types. Complete data were available for full coal trains (n = 15), empty coal trains (n = 14), freight trains (n = 568) and passenger trains (n = 2235) as identified by the video recordings from the camera system described above. There were some significant differences between characteristics of the train types (see Appendix B for detailed summary statistics). For example, focusing on freight trains versus full coal trains, the mean duration (in seconds) and speed (m/s) for the former were 236 and 18.3, versus 144 and 12.5 for the latter. At the other extreme, the means of these same parameters for passenger trains were 2.2 s and 31.7 m/s.
The results for the basic regression model are presented in Table 1. As expected, wind and train speed were both statistically significant. In addition, the passage of an empty coal car contributed about 2.3 µg/m3 (95% CI = -0.28, 4.82; p < 0.1) to the ambient air, while freight and full coal cars contributed 4.5 µg/m3 (95% CI = 3.82, 5.18; p < 0.01) and 6.8 µg/m3 (95% CI = 4.34, 9.24; p < 0.01), respectively. Controlling for the direction of the freight train did not alter the results. This finding indicated that the regression coefficients of these three train types (freight and full/empty coal) were significantly different from zero and also statistically different from each other. In contrast, the PM2.5 increment from passenger trains was relatively small and not significantly different from zero, so it was not included in the sensitivity analyses below. The amount of explained variation from the basic model was relatively low at 16%.
The findings of the sensitivity analysis are displayed in Table 2. Given the null impact of passenger trains, further results for this mode were not included. Model (1) reproduces the results of the basic model. Model (2) added the average temperature during the one-hour average that included the train passage, and resulted in increases in the PM2.5 impact for all three train types, with empty coal, freight and full coal cars contributing 5.6 µg/m3 (95% CI = 2.5, 8.7), 7.5 µg/m3 (95% CI = 5.8, 9.2) and 9.7 µg/m3 (95% CI = 6.8, 12.6), respectively. All were statistically significant with p < 0.01. Model (3) indicates the impact of adding humidity, which resulted in reductions of approximately 2.5 µg/m3 from the basic case. Model (4) again includes humidity but assigns a zero value when the change in PM2.5 was negative. This adjustment slightly increased the PM2.5 contribution of all of the train types. Model (5) adds a control for dewpoint, a combination of temperature and humidity, which resulted in an increase in the change in PM2.5 from the basic model, while in Model (6) observations are restricted to those occurring during calm wind conditions (less than the mean of 3.0 mph). This constraint significantly increased the contribution of coal trains to ambient PM2.5 to 12.1 µg/m3 (95% CI = 7.7, 16.5; p < 0.01) versus 5.1 µg/m3 (95% CI = 3.8, 6.4; p < 0.01) for freight cars. Finally, Model (7) uses the data from the calibrated PM2.5 concentrations and generated statistically significant estimates of 8.3 µg/m3 (95% CI = 6.4, 10.3; p < 0.01) and 6.5 µg/m3 (95% CI = 6.0, 7.1; p < 0.01), respectively, for full coal and freight trains. Models (1) through (6) each exhibited a modest R2 of less than 0.19. However, the calibrated Model (7), which provided a robust correction for humidity, explained 53% of the variation in the change in PM2.5.
Additional model specifications of Model (7) with covariates used in the earlier models, such as train duration, effective wind speed or quadratic terms, failed to improve the model fit.
Table 3 displays the regression results for the increase in peak (10 s average) PM2.5 concentrations above the control concentrations during the passing of full coal cars (n = 18), empty coal cars (n = 16) and freight cars (n = 653). Results for passenger trains were not included since few of these trains had durations of 10 seconds or more. The model specifications were similar to those used in the previous analyses and included wind speed, train speed and the 3 train types. Given the above findings, we focused on 3 different models: a basic model (Model 1), a model corrected and calibrated for humidity as above (Model 2), and the calibrated model under calm wind conditions defined as average wind less than the mean (Model 3).

Discussion
Our results indicate that passing coal trains add, on average, approximately 8.32 µg/m3 (95% CI = 6.37, 10.28; p < 0.01) to the ambient PM2.5, with a range of midpoint estimates, based on the sensitivity analysis, of 5 to 12 µg/m3. These results also suggest that full coal cars contribute approximately 2 to 3 µg/m3 more PM2.5 than the freight trains observed in our Richmond, California sample. Strikingly, with very calm winds, the nearby concentrations from coal trains were about 12 µg/m3 versus 5.1 µg/m3 for freight trains. This suggests the possibility that our study underestimates the emissions and overall impact of dust from coal trains, since on windier days the dust may simply be dispersed over a wider region beyond our monitoring site. We also observed that unloaded coal cars tended to add 2 µg/m3 of PM2.5 to the existing ambient concentrations, with a range from our sensitivity analysis of about one (nonsignificant) to over 5 µg/m3. Regarding peak (10 s) concentrations of PM2.5, the calibrated model indicated an increase of 17.4 µg/m3 (95% CI = 6.2, 28.5) from coal trains, which tended to contribute about 3.5 µg/m3 more than freight trains across the models examined. Calm wind conditions resulted in an increase from coal trains of 20 µg/m3 (95% CI = 3.4, 36.6; p < 0.01).
Given the known bias of humidity on optical PM monitors, in addition to controlling for humidity and dewpoint directly in the model specification, a regression model was estimated using data calibrated and corrected for humidity using a nearby FEM monitor (Barkjohn et al. 2021). It is well established that mass calibrations of optical sensors are temporally and spatially dependent on particle optical characteristics (Dubovik et al. 2002;Bond and Bergstrom 2006). The assumption here is that consistent calibration factors from monitors within the same geographic region and time period are reasonable surrogates for in situ calibration.
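As a worked example of the calibration, the fit reported in the Analysis section (PM2.5_C = 9.79 + 0.76 × PM_PA − 0.095 × humidity) can be applied to a single reading. The function name is hypothetical, and the sketch assumes humidity is expressed as relative humidity in percent, which the text does not state explicitly.

```python
def calibrate_pm25(pm_pa, humidity):
    """Apply the reported fit: PM2.5_C = 9.79 + 0.76 * PM_PA - 0.095 * humidity.

    pm_pa: raw optical sensor reading at the train site (µg/m3).
    humidity: relative humidity, assumed here to be in percent.
    """
    return 9.79 + 0.76 * pm_pa - 0.095 * humidity

# A raw reading of 20 µg/m3 at 60% relative humidity.
example = calibrate_pm25(20.0, 60.0)  # 9.79 + 15.2 - 5.7 = 19.29
```

Because the slope on the raw reading is 0.76, a change in raw PM2.5 between the control period and the train passage is scaled by the same factor in the calibrated series when humidity is unchanged between the two periods.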
There are only a few previous studies that have measured PM 2.5 concentrations from coal trains. One study examined coal and freight trains passing through the rural Columbia River Gorge (Washington) in the summer of 2014 (Jaffe et al. 2015). The study examined the difference between the 10 s maximum PM 2.5 and the background concentration. The authors observed a doubling in peak concentration for coal trains (20.9 µg/m 3 ) versus freight trains (10.7 µg/m 3 ). This is consistent with our results for coal trains using a similar averaging time of 17.4 µg/m 3 . The average effective wind speeds in the Jaffe study were much higher than those in our study and were often associated with very high concentrations of PM 2.5 . This suggests that PM 2.5 concentrations associated with train passage are likely to be even greater in certain areas farther away from the City of Richmond's urban setting due to greater train speeds.
A previous study collected data on coal trains operating in the Fraser River Delta area of British Columbia, Canada. In comparing ambient air impacts of the coal trains (n = 20) to background concentrations, the results suggested an increase of 5.3 (a 54% increase over background), 4.1, and 2.6 µg/m3, respectively, for PM3 (comparable to PM2.5), PM10, and PM20, with occasional spikes in PM3 from coal trains to 100 µg/m3 (Akaoka et al. 2017).
Another study collected data on a single day from four monitors located at varied distances from the train line on full (n = 10) and empty (n = 11) coal trains heading to and from the Port of Newcastle in New South Wales, Australia (Higginbotham et al. 2013). For full coal cars, there were increases of 2.9 and 7.2 µg/m3, respectively, for PM2.5 and PM10, and of 7.1 and 18.9 µg/m3 for empty coal cars. Higher impacts for empty coal cars were also reported in studies by Katestone Environmental Pty Ltd (2013). Finally, Ryan and Wand (2014) analyzed the impacts of freight, empty coal and full coal trains in the Hunter Valley in New South Wales, Australia. The crude (unadjusted) increases in PM2.5 for passing freight, empty coal and full coal cars were 0.53, 1.13 and 1.20 µg/m3, respectively; all were statistically significant differences from baseline levels. Their measurements indicated that particulate concentrations were elevated not only during but also prior to and especially after a train's passing.
Most of the dust from coal trains originates from the rail car (80%), with spilled coal (9%) and door leakage (6%) being other sources (Connell Hatch 2008). A consequence is coal dust deposition, with studies finding that, on average, coal composed 6-25% of deposited dust in rail corridors, although Akaoka et al. report up to 90% in local dust (Akaoka et al. 2017; DSITIA 2015). Evidence indicates that particulate matter from coal trains, storage and open mines can disperse at least 500 m from the source (Trivedi et al. 2009; Akaoka et al. 2017; Srivastava et al. 2021; Sahu and Pakra 2022).
To put our results into perspective, the current U.S. EPA annual and 24-h average standards are 12 and 35 µg/m3, respectively, while the World Health Organization guidelines for the same averaging times are 5 and 15 µg/m3 (U.S. EPA 2019; World Health Organization 2021). In addition, both U.S. EPA and WHO indicate that there is no threshold or safe level for ambient PM2.5. Therefore, a hypothetical three coal trains per week in an urban area could represent an important increase in PM2.5 to nearby residents. Incremental concentrations would subsequently increase the risk of a wide range of health effects, including premature mortality, cardiovascular and respiratory hospitalization or urgent care visits, increases in or exacerbation of asthma, adverse birth outcomes (e.g., low birth weight, prematurity, birth defects and neurodevelopment), possible neurological impacts in children and adults (autism, Alzheimer's, Parkinson's), as well as functional impacts such as days with respiratory symptoms, restricted activity, and work or school loss (WHO 2021). As noted above, even acute PM2.5 exposures as short as one hour (or a few hours) can increase the risk of adverse health outcomes, including acute myocardial infarction, hospitalization and emergency department visits for cardiovascular and respiratory disease, ambulance calls and asthma exacerbation (Yorifuji et al. 2014; Kim et al. 2015; Chen et al. 2019).
Our study has several advantages including the development of an AI-based platform for precise identification of train types during the day or night; real time measurement of PM 2.5 and meteorology; siting of a monitor with only the trains as a source of PM 2.5 ; and the ability to produce data on train direction and speed. There were also some shortcomings in our study. There was only a small number of full and unloaded coal cars due to the reduction in economic activity during the COVID-19 pandemic and related supply chain issues. There was only a single monitor to measure the impact of passing trains. This was due to both logistical constraints pursuant to the COVID-19 pandemic and the difficulty in finding monitor host sites that were not impacted by other PM 2.5 pollution sources in Richmond, a city transected by major highways, refineries, other heavy industry and a port. There is the possibility of exposure misclassification if some of the freight trains also included coal cars. The low R 2 in some of the regression models could be due to several factors including the assignment of hourly wind, temperature and humidity to the 4-5 min of train passage and uncertainty in estimating train speed and length. There were also unmeasured factors such as train weight and number of engines. Finally, it is important to note that our analysis did not include measurements of either ultrafine (particles less than 0.1 micron) or coarse particles (PM10) which will always be generated from the passing trains. Since there is substantial evidence of adverse health effects from both of these particle sizes, the actual health risks posed by passing coal trains are clearly underestimated in this present study (Adar et al. 2014;Ostro et al. 2015).
Identifying the source of fugitive dust is important in part because the implications of exposure extend beyond individual and population health effects to matters of environmental and racial justice (Mikati et al. 2018). While coal dust can have far-ranging population exposures, the communities in relatively close proximity to the rail lines will be disproportionately exposed. These residents are more likely to be of lower income or people of color (or both) and also more vulnerable to adverse health outcomes (Hricko et al. 2014; Jha and Muller 2017).
Finally, the impacts of the rail transport of coal are compounding because it involves traversing thousands of kilometers, meaning multiple environmental justice communities are impacted. Ecosystems such as rivers and coastlines also receive extended exposure as the rails often trace their contours. Further, the climate change implications of coal transport, storage and handling are significant, ultimately resulting in up to 16% of US carbon pollution (Meyer 2019).

Conclusion
In this paper, we have reported evidence of significant increases in PM2.5 due to passing coal-carrying trains in Richmond, California. The observed increases were greater than those produced by freight trains and passenger trains. Unloaded coal cars also generated increases in PM2.5, but at lower concentrations than full coal cars. Quantifying the contribution of coal trains to air pollution in urban populations is important since vulnerable communities are typically found in close proximity to rail lines. In addition, inevitable dispersion of PM2.5 will increase population exposure over a much wider area. Since shipment of coal by train occurs throughout the world and through many urban areas, it represents a significant public health hazard. Finally, to overcome technical challenges that have historically been barriers to the study of coal trains, we developed an artificial intelligence-driven monitoring platform to detect and quantify air pollution from passing trains. These advancements will contribute to future studies of health effects from mobile sources.

Appendix A. Data field descriptions
The following discussion provides more background regarding the data fields, their units, and their calculation.
• TrainStart: This field reports the date and time, to the second, that a train was observed by the customized artificial intelligence camera system.
• TrainType: This study classified the observed trains into four types:
Passenger - either Amtrak or CalTrain trains
Freight - Union Pacific trains that are carrying various products on rail cars but not identifiable coal-bearing cars
Coal - Union Pacific trains exclusively carrying coal hopper rail cars, either full or unloaded
• PreTrainPM#: These fields represent the air quality levels prior to the arrival of a train. We explored various combinations of averaging times (1, 3, 5 and 10 min) and gap lengths between the PreTrainPM average and the observed train start time (1, 2, 5 and 10 min). For example, PreTrainPM10-10 utilizes a 10-min average PM2.5 with a 10-min gap between the averaging period and the train event start. This field is presented in micrograms per cubic meter.
• AvgPM: This field is the average PM2.5 concentration, in micrograms per cubic meter, observed during the time that a train was observed at the monitoring location. Note that each second of recorded data is the average of three separate PM sensors, and this value is the average of those averages.
• MaxPM: This field is the maximum 10-s PM2.5 concentration, in micrograms per cubic meter, observed during the time that a train was observed at the monitoring location. Again, the data point is the average of three separate PM sensors.
• PMdelta_10_10: These fields represent the difference between the PM2.5 concentrations observed during the train event and the pre-event levels. They are calculated as:
$$\text{PMdelta\_10\_10} = \text{XPM} - \text{PreTrainPM10-10}$$
where XPM is either the AvgPM or the MaxPM.
• Meters per second: This field is the estimated speed of the train using a custom video processing algorithm. The video is separated into individual images. Two positions (x-coordinates) are selected on either side of each image to act as positional triggers. The pixel values (in RGB) are collected from every pixel in each of these two lines. As the trains pass, we anticipate the pixel values to oscillate more than in the rest of the image; therefore, a y-coordinate is chosen with the maximum relative standard deviation out of all the y-coordinates in each line. Since each image is related to a distinct timestamp, we compare the first derivative of pixel intensity change in each optical band (R, G, and B) to find the temporal difference in peaks. Specifically, we calculate the lag time between when the train is observed in each of the two locations. The distance between these two points is calculated from a conversion factor of pixels per meter, as determined by image analysis of Amtrak engines and train cars, whose dimensions are publicly available. Knowing the distance traveled (in meters) and the time difference (in seconds), we calculated the speed in meters per second.
• Direction: This field indicates the direction the train was traveling in the observation. The determination of this parameter uses the same algorithm described for the previous parameter (m/s). The derivative peaks analyzed inform whether the train is moving to the right (northward and away from the terminal) or left (southward or towards the terminal), which result in binary values of 1 and 0, respectively.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.