Myriad large epidemiological studies definitively establish that exposure to fine particulate matter (particles less than 2.5 microns in diameter or PM2.5) is associated with a wide range of adverse health effects. Exposure to PM2.5 has been linked to premature mortality, cardiovascular, cerebrovascular, and respiratory diseases, other chronic diseases, adverse birth outcomes, and cognitive and developmental impairments (WHO 2021, U.S. EPA 2019). These effects occur even at concentrations lower than current regulatory standards (Brunekreef et al. 2021). While most studies of PM2.5 have examined daily or multi-year exposures, there is evidence of health effects from exposures as short as one hour (Liu et al. 2021; Peters et al. 2001; Wu et al. 2020). The World Health Organization recently lowered its air quality guidelines and indicated there is no known safe level of PM2.5 (U.S. EPA 2019; WHO 2021). The recent study of the Global Burden of Disease estimates that exposure to PM2.5 contributed to 6.7 million deaths per year worldwide, nearly 12% of the global total and the fourth highest risk factor for global mortality (Fuller et al. 2022). Of note, exposure to PM2.5 constitutes an environmental justice concern as exposure and adverse effects are borne disproportionately by the most vulnerable, including infants, children, the elderly, people of color, those with low incomes, and those with underlying health conditions (Tessum et al. 2021).

Recent studies report that the combustion of fossil fuels, including coal, oil, and natural gas, is the largest source of ambient PM2.5-related mortality, with coal the largest contributor to this mortality (Vohra et al. 2021; McDuffie et al. 2015). Combustion, however, is not the only source of coal-related particulate matter, as fugitive dust from rail transport is known to be significant (BNSF Railway 2011). Trains transport nearly 70% of coal deliveries in the United States, with coal accounting for 1 of every 3 tons of American rail freight (US Energy Information Administration 2022). In a note to its customers, the BNSF Railway’s own assessment stated: “The amount of coal dust that escapes from PRB [Powder River Basin in Wyoming and Montana] is surprisingly large” and reports have indicated that as much as 3% of the coal loaded into a coal car can be lost in transit (Baruya 2012; BNSF Railway 2011). Studies have confirmed that coal trains produce particulate matter not only through engine diesel emissions but also directly from the coal. These latter emissions occur via blow-off, suspension, and re-entrainment from wind erosion and wind scouring of loaded and unloaded coal cars, door leakage, and the “parasitic load,” i.e., coal spilled and carried on external parts of the train (Prakash et al. 2018). The magnitude of ambient particulates from coal trains is influenced by train and wind speed, weather, moisture, rail car and load geometry, physical properties of the coal, vibration, and the use and efficacy of dust suppression methods (Prakash et al. 2018). Unfortunately, the actual contribution of coal trains to ambient PM2.5 is poorly documented.

Given the dearth of studies quantifying the effects of coal transport on subsequent concentrations of particulate matter and the significant health implications of exposure to particulate matter, additional study is warranted. Below, we report results from the novel monitoring system we developed and utilized to quantify the contribution to ambient PM2.5 from uncovered railcars that convey coal predominantly from mines in Southern Utah to the Levin Terminal in Richmond, California.

Methods and materials

Data collection

Particulate matter from coal is known to contain many impurities and elements including heavy metals known to be toxic or carcinogenic to humans (OEHHA 2015). Specifically, the coal of interest in this study originated from the Wasatch Plateau coal fields, a coal-bearing outcrop approximately 145 km long and 11 to 32 km wide (Hatch et al. 1979). Previous assessments have determined coal from the plateau to be high volatile bituminous (Hatch et al. 1979). The coal is primarily carbonaceous with various inclusions and impurities, including several mineral species along with elemental impurities of Cr, Ni, and Se. There are also trace elements including As, Ba, Cd, F, Mn, Sb, Sr, Th, U, and V (Hatch et al. 1979).

To determine the PM2.5 concentrations resulting from passing full coal, unloaded (“empty”) coal, freight, and passenger trains, we monitored trains from May 19, 2022 through October 31, 2022 at a populated residential site approximately 7 km north of the terminal. The site is near the culmination of an 800-mile journey, thereby capturing the realistic conditions of long-haul coal conveyance as compared to the conditions at departure, where dust suppressants are freshly applied and trains are optimally loaded. The monitoring site is approximately 21.5 meters east (generally downwind) of the rail line, with parkland to the east and the San Francisco Bay to the west (Fig. 1). The site was selected to avoid PM2.5 from other important sources such as major roadways, industrial facilities, Richmond port operations and the Levin terminal itself. This location and our study methodology ensured that any observed changes in PM2.5 as the trains passed were strictly due to the trains themselves.

Fig. 1 Location of monitor site and surroundings

The train monitoring system comprises three data collection systems:

  1. A personal weather station
  2. An air quality sensor
  3. A custom camera system

The personal weather station was selected for direct data output via serial communication (VantageVue, Davis Instruments, USA). It provides temperature, ambient pressure, relative humidity, precipitation, and other meteorological parameters. The meteorological data is collected once per minute. In addition, hourly wind speed and direction were derived from the NOAA site in Richmond for comparison.

The air quality sensor is a custom package consisting of three optical PM sensors (PMS5003, Nanchang Panteng Technology Co., Ltd, China). These are equivalent to cell-reciprocal nephelometers and are commonly recognized as the sensors used in the widely-distributed PurpleAir PA-II monitor (Ouimette et al. 2022). The sensor responds to optical scattering from a 657 nm laser. Therefore, it is associated with mass via the mass scattering coefficient, which is a function of the chemical, morphological, and optical properties of the observed particles. The accuracy of this determination is governed by the variability of particle characteristics in the temporal and spatial dimensions. The sensors’ high temporal resolution of one second and their inter-instrument precision, as assessed by numerous field and laboratory studies, were the principal qualities that enabled the detection of rapid train events (Tsai et al. 2020; AQ-SPEC 2022). Three channels were included to strengthen data quality control and calculate variance for each observation. The raw data from all three sensors was collected every second.

Data quality metrics of the PM2.5 data were evaluated for 1 s, 10 s, and 10 min, equivalent to instantaneous readings, train event averaging, and pre-event background conditions, respectively. Prior to evaluation, the data was cleaned to remove aberrant sensor readings. Specifically, values outside two standard deviations were omitted. In all cases, these values were excessively high readings from the low-cost sensor. The observations used in the subsequent statistical analysis ranged from 0 to 117.45 µg m−3 with a median uncertainty of 27%, well within the linearity range of the sensors of < 300 µg m−3 (Barkjohn et al. 2022).
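The cleaning rule described above can be sketched in a few lines of Python. This is an illustration only: the function name `drop_outliers`, the use of the window mean as the reference point, and the choice of the sample standard deviation are assumptions, since the text does not specify them.

```python
from statistics import mean, stdev

def drop_outliers(readings, k=2.0):
    """Keep readings within k sample standard deviations of the mean.

    Assumption: the reference point is the mean of the supplied window.
    """
    if len(readings) < 2:
        return list(readings)
    mu = mean(readings)
    sigma = stdev(readings)
    return [x for x in readings if abs(x - mu) <= k * sigma]

# Synthetic one-second readings (µg m−3) with one aberrant spike.
window = [5.0, 5.2, 4.9, 5.1, 5.0, 5.3, 4.8, 5.1, 5.0, 250.0]
cleaned = drop_outliers(window)
```

In this synthetic window only the 250.0 spike is removed; the remaining nine readings pass through unchanged.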

The custom camera system consists of a microcomputer (Jetson Nano, Nvidia, USA), a camera (NoIR PiCamera, Raspberry Pi), an artificial intelligence (AI) accelerator (Coral Edge TPU, Google, USA), a solid-state hard drive (500 GB T5, Samsung, S. Korea), and an infrared floodlight (IR Illuminator 30 deg, Axton Technologies, USA). The system is placed approximately 60 m from the chosen source and operates autonomously on a continuous basis, except for a daily 30-min period when data is being uploaded to a cloud server (Lightsail, Amazon Web Services, USA).

The camera system is the pivotal technology that enables detection of passing trains. Images from the camera are passed to the computer at 30 frames per second, where they are pre-processed and passed to the AI accelerator. The accelerator is a Tensor Processing Unit (TPU, Coral Edge TPU, Alphabet, USA), which runs an image classification model customized for the monitoring location. This model identifies whether or not a train is present in the image. If so, the computer creates a train event and records: one second before the train was detected, the entire train event, and one second after the train is no longer detected. This recording is saved as an individual train event to an external hard drive. Train speed (meters per second) and the train direction towards or away from the terminal were determined during manual post-processing of the data. Determining object velocity from video recordings is error prone due to variable image processing rates. Instead, train speeds were estimated by using the average frame rate (frames per second) recorded during the monitoring period and fixed observation points in the camera’s field of view. A schematic diagram of the system is presented in Fig. 2.

Fig. 2 A schematic of the data collection system
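The speed estimate described above reduces to distance divided by elapsed time, with elapsed time taken from a frame count and the average frame rate. A minimal sketch, assuming two fixed observation points a known distance apart; the function and parameter names are hypothetical and do not come from the study's software.

```python
def estimate_speed_mps(frames_between_points, avg_fps, distance_m):
    """Estimate train speed in meters per second.

    frames_between_points: frames elapsed while the train front moves
        from the first fixed observation point to the second
    avg_fps: average recorded frame rate over the monitoring period
    distance_m: ground distance between the two observation points
    """
    if frames_between_points <= 0 or avg_fps <= 0:
        raise ValueError("frame count and frame rate must be positive")
    elapsed_s = frames_between_points / avg_fps
    return distance_m / elapsed_s

# Illustrative values only: 45 frames at 30 fps over 30 m.
speed = estimate_speed_mps(45, 30.0, 30.0)
```

Using the average frame rate rather than per-frame timestamps trades some precision for robustness against the variable processing rates mentioned in the text.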

For each 24-h period, data was aggregated from all three data sources and standardized into one-second observations for each measurement parameter, including meteorology, PM2.5 concentrations and train detection. During the daily upload period of less than 30 min, the monitoring systems were disabled and the data file, along with the associated video files, was uploaded to the cloud server. The data aggregation and upload period was scheduled in the early morning hours, when train activity was determined to be consistently absent. Files located in the cloud were retrieved at the user’s convenience for post-processing, which consisted of associating particulate matter and meteorological data with the observed train events based on the shared data timestamps. Accurate date and time determinations were ensured by consistent internet connection and verification by the operating system. Further detail on the derivation of the variables used in our analysis is provided in Appendix A.
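One way to picture the standardization into one-second observations is a join on shared timestamps. The sketch below assumes epoch-second keys, averages the three PM channels, and carries each one-minute weather record across the seconds of that minute. The data layout and all names are assumptions for illustration, not the study's actual file format.

```python
from statistics import mean

def merge_to_seconds(pm_channels, weather_by_minute, train_seconds):
    """Build one row per second from three sources keyed by epoch second.

    pm_channels: {second: [ch1, ch2, ch3]} raw readings from the PM sensors
    weather_by_minute: {minute_start_second: {"temp_c": ..., "rh": ...}}
    train_seconds: set of seconds during which a train was detected
    """
    rows = []
    for t in sorted(pm_channels):
        minute_start = t - (t % 60)  # weather is reported once per minute
        weather = weather_by_minute.get(minute_start, {})
        rows.append({
            "t": t,
            "pm25": mean(pm_channels[t]),
            "temp_c": weather.get("temp_c"),
            "rh": weather.get("rh"),
            "train": t in train_seconds,
        })
    return rows

# Synthetic example: two seconds of PM data inside one weather minute.
rows = merge_to_seconds(
    {120: [4.0, 5.0, 6.0], 121: [5.0, 5.0, 5.0]},
    {120: {"temp_c": 18.0, "rh": 60.0}},
    {121},
)
```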

Data management

The PM2.5 during train passage was recorded in one-second concentrations and averaged for the roughly 4 to 5 min of passage (longer for freight trains). In addition to the PM2.5 average during the passage of the train, the maximum 10-s average concentrations during the train passage were also recorded and analyzed in order to compare with previous studies.
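The maximum 10-s average can be computed with a sliding window over the one-second series. A minimal sketch, assuming a gap-free list of one-second concentrations for a single train event; the function name is hypothetical.

```python
def max_rolling_mean(values, window=10):
    """Return the largest mean over any run of `window` consecutive values.

    Returns None when the series is shorter than the window.
    """
    if len(values) < window:
        return None
    running = sum(values[:window])
    best = running
    for i in range(window, len(values)):
        # Slide the window by one sample: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        best = max(best, running)
    return best / window

# Synthetic event: background of 1.0 with a 10-second plateau at 11.0.
series = [1.0] * 20 + [11.0] * 10 + [1.0] * 20
peak = max_rolling_mean(series)
```

Events shorter than the window return `None`, so they drop out of a peak analysis rather than being averaged over fewer samples.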

To determine the change in PM2.5 due to passing trains, we quantified the difference between the measured PM2.5 at the rail site and a “control” period of exposure. The control, also considered a “pre-exposure” period, corresponded to the period just prior to a train’s passage, allowing capture of ambient PM2.5 without the train’s contribution as well as controlling for normal diurnal and regional changes in PM2.5 concentrations. A generally similar approach was used by previous studies (Jaffe et al. 2015; Akaoka et al. 2017). We also established a gap between the control period and the train passage to ensure that particles influenced by the high-pressure zone in front of an oncoming train would not be included in the control.

To select the duration of the control exposure and this gap immediately before the train passage, we examined several alternatives, including a five-minute average ending with a 2-minute gap before the train (5/2) as well as 3/2, 5/5, 10/2, 10/5 and 10/10. Ultimately, the results were insensitive to the alternative control and gap periods, so only the results using 10/10 are reported below.
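The control-and-gap scheme can be expressed as a small function that takes the control duration and gap as parameters, so alternatives such as 5/2 or 10/10 differ only in their arguments. This is a sketch under assumptions: the series is a dictionary of one-second values keyed by epoch second, and the function name is hypothetical.

```python
from statistics import mean

def pm_change(pm_by_second, train_start, train_end, control_min=10, gap_min=10):
    """Mean PM2.5 during passage minus mean PM2.5 in the pre-train control window.

    The control window lasts control_min minutes and ends gap_min minutes
    before the train arrives. Returns None if either window has no data.
    """
    control_end = train_start - gap_min * 60
    control_start = control_end - control_min * 60
    control = [pm_by_second[t] for t in range(control_start, control_end)
               if t in pm_by_second]
    passage = [pm_by_second[t] for t in range(train_start, train_end + 1)
               if t in pm_by_second]
    if not control or not passage:
        return None
    return mean(passage) - mean(control)

# Synthetic series: flat background of 5.0 with a train at seconds 1300-1400.
pm = {t: 5.0 for t in range(2000)}
for t in range(1300, 1401):
    pm[t] = 12.0
delta_10_10 = pm_change(pm, 1300, 1400, control_min=10, gap_min=10)
delta_5_2 = pm_change(pm, 1300, 1400, control_min=5, gap_min=2)
```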


We addressed several issues including whether full or empty coal cars contribute to local ambient PM2.5 concentrations and, if so, by how much. We also compared the impacts of coal cars relative to those of both freight and passenger trains. Using multiple linear regression with the change in PM2.5 concentration as the dependent variable, our model included binary variables for each of the four train types (passenger, freight, empty and full coal) and examined and controlled for potential confounders. For example, a previous study found a strong association between PM2.5 from coal trains and the effective wind speed (the sum of train and wind speed) (Jaffe et al. 2015). To test the sensitivity of our results to the model specification, we examined the impact of several covariates including train speed, wind speed, effective wind speed, duration of exposure (based on the elapsed time of the train passing), average temperature, dewpoint and relative humidity. The inclusion of humidity served to control for the potential impact of the hygroscopic property of fine particles when measured with optical sensors. We ran the model without a constant term, which facilitated the direct comparison of the impact among the train types. The model results were identical to a model that adds a constant term and drops one of the train types to avoid multi-collinearity.
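The equivalence between the two specifications mentioned above (no constant with all four train-type indicators, versus a constant with one type dropped) can be checked numerically. The sketch below uses a small hand-written least-squares solver and synthetic data. The effect sizes, the single wind covariate, and all names are illustrative assumptions, not the study's data or code.

```python
TYPES = ["passenger", "freight", "empty_coal", "full_coal"]

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Synthetic observations (train type, wind speed, change in PM2.5); not study data.
effects = {"passenger": 0.0, "freight": 4.0, "empty_coal": 2.0, "full_coal": 7.0}
obs = [(t, w, effects[t] - 0.5 * w) for t in TYPES for w in (1.0, 2.0, 4.0)]
y = [d for _, _, d in obs]

# Specification A: four train-type indicators, no constant term.
X_a = [[1.0 if t == name else 0.0 for name in TYPES] + [w] for t, w, _ in obs]
b_a = ols(X_a, y)

# Specification B: constant term, passenger dropped as the reference type.
X_b = [[1.0] + [1.0 if t == name else 0.0 for name in TYPES[1:]] + [w]
       for t, w, _ in obs]
b_b = ols(X_b, y)

# In B, a train type's effect is the constant plus that type's coefficient.
freight_a = b_a[1]
freight_b = b_b[0] + b_b[1]
```

In specification A each coefficient is read directly as that train type's contribution, which is the convenience the text describes. Specification B carries the same information but expresses each type relative to the reference type.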

Additional sensitivity analysis included examining the impact of converting the negative values for the change in PM2.5 into zero values. The negative change from the control period could be a result of significant dust from activities at the monitor’s residential location, dust from trains occurring in the control period, or from a sudden change in wind speed or direction prior to the train arrival. We also considered subsets of certain covariates. For example, we examined those days where wind was below the mean level of 3.1 mph since these calmer periods may relate to higher concentration at the nearby monitor, whereas particles may disperse to a larger area under other wind conditions. Finally, we tested a model where the air quality sensor was calibrated and directly corrected for relative humidity using the closest Federal Equivalent Method (FEM) monitor to our site. This monitor was located in nearby San Pablo (Air Quality System Site ID: 06-013-1004), 1.6 km from our site and generated the following fit, with an R2 of 0.58:


where PM2.5_C is the calibrated and corrected concentration of PM2.5, and PM_PA is the original reading at the train site. In addition to the average change in PM2.5 (difference of PM2.5 during train passage and the control), the maximum (10 s average) concentration relative to the control period was analyzed to compare with findings of previous studies.


Ultimately, during the six-month observation period, the increases in ambient PM2.5 concentrations were measured during the passage of four different train types. Complete data were available for full coal trains (n = 15), empty coal trains (n = 14), freight trains (n = 568) and passenger trains (n = 2235) as identified by the video recordings from the camera system described above. There were some significant differences between characteristics of the train types (see Appendix B for detailed summary statistics). For example, focusing on freight trains versus full coal trains, the mean duration (in seconds) and speed (m/s) for the former were 236 and 18.3, versus 144 and 12.5 for the latter. At the other extreme, the means of these same parameters for passenger trains were 2.2 s and 31.7 m/s.

The results for the basic regression model are presented in Table 1. As expected, wind and train speed were both statistically significant. In addition, the passage of an empty coal car contributed about 2.3 µg/m3 (95% CI = -0.28, 4.82; p < 0.1) to the ambient air, while freight and full coal cars contributed 4.5 µg/m3 (95% CI = 3.82, 5.18; p < 0.01) and 6.8 µg/m3 (95% CI = 4.34, 9.24; p < 0.01), respectively. Controlling for the direction of the freight train did not alter the results. This finding indicated that the regression coefficients of these three train types (freight and full/empty coal) were statistically significantly different from zero and also statistically different from each other. In contrast, the PM2.5 increment from passenger trains was relatively small and not significantly different from zero, so it was not included in the sensitivity analyses below. The amount of explained variation from the basic model was relatively low at 16%.

Table 1 Regression Results for Basic Model

The findings of the sensitivity analysis are displayed in Table 2. Given the null impact of passenger trains, further results for this mode were not included. Model (1) reproduces the results of the basic model. Model (2) added the average temperature for the hour that included the train passage, and resulted in increases in the PM2.5 impact for all three train types, with empty coal, freight and full coal cars contributing 5.6 µg/m3 (95% CI = 2.5, 8.7), 7.5 µg/m3 (95% CI = 5.8, 9.2) and 9.7 µg/m3 (95% CI = 6.8, 12.6), respectively. All were statistically significant with p < 0.01. Model (3) indicates the impact of adding humidity, which resulted in reductions of approximately 2.5 µg/m3 from the basic case. Model (4) again includes humidity but assigns a zero value when the change in PM2.5 was negative. This adjustment slightly increased the PM2.5 contribution of all of the train types. Model (5) adds a control for dewpoint, a combination of temperature and humidity, which resulted in an increase in the change in PM2.5 from the basic model, while in Model (6) observations are restricted to those occurring during calm wind conditions (less than the mean of 3.0 mph). This constraint significantly increased the contribution of coal trains to ambient PM2.5 to 12.1 µg/m3 (95% CI = 7.7, 16.5; p < 0.01) versus 5.1 µg/m3 (95% CI = 3.8, 6.4; p < 0.01) for freight cars. Finally, Model (7) uses the data from the calibrated PM2.5 concentrations and generated statistically significant estimates of 8.3 µg/m3 (95% CI = 6.4, 10.3; p < 0.01) and 6.5 µg/m3 (95% CI = 6.0, 7.1; p < 0.01), respectively, for full coal and freight trains. Models (1) through (6) each exhibited a modest R2 of less than 0.19. However, the calibrated Model (7), which provided a robust correction for humidity, explained 53% of the variation in the change in PM2.5. Additional specifications of Model (7) with covariates used in the earlier models, such as train duration, effective wind speed or quadratic terms, failed to improve the model fit.

Table 2 Regression Results of the Increase in Average PM2.5 (µg/m3) in Alternative Models

Table 3 displays the regression results for the increase in peak (10 s average) PM2.5 concentrations above the control concentrations during the passing of full coal cars (n = 18), empty coal cars (n = 16) and freight cars (n = 653). Results for passenger trains were not included since few of these trains had durations that were 10 seconds or more. The model specifications were similar to those used in the previous analyses and included wind speed, train speed and the 3 train types. Given the above findings, we focused on 3 different models: a basic model (Model 1), a model corrected and calibrated for humidity as above (Model 2), and the calibrated model under calm wind conditions defined as average wind less than the mean (Model 3).

Table 3 Regression Results of the Increase in Peak PM2.5 (µg/m3) in Alternative Models

For the basic model, the results indicated an increment in maximum PM2.5 over the control period of 22.9 µg/m3 (95% CI = 8.1, 37.5; p < 0.01) for full coal trains. For the model calibrated and corrected for humidity, the increment from coal cars was 17 µg/m3 (95% CI = 6.2, 28.5; p < 0.01) while the corresponding change in PM2.5 was 14.1 µg/m3 (95% CI = 7.9, 20.2; p < 0.01) for freight trains and 9.3 µg/m3 (95% CI = -3.0, 21.5, NS) for empty coal cars. Under calm wind conditions, the impact from coal cars increased to almost 20 µg/m3 (95% CI = 3.4, 36.6; p < 0.05) while the freight increment did not change from the previous case.


Our results indicate that passing coal trains add, on average, approximately 8.32 µg/m3 (95% CI = 6.37, 10.28; p < 0.01) to the ambient PM2.5, with a range of midpoint estimates, based on the sensitivity analysis, of 5 to 12 µg/m3. These results also suggest that full coal cars contribute approximately 2 to 3 µg/m3 more PM2.5 than the freight trains observed in our Richmond, California sample. Strikingly, with very calm winds, the nearby concentrations from coal trains were about 12 µg/m3 versus 5.1 µg/m3 for freight trains. This suggests that our study may underestimate the emissions and overall impact of dust from coal trains, since on windier days the dust may simply be dispersed over a wider region beyond our monitoring site. We also observed that unloaded coal cars tended to add 2 µg/m3 of PM2.5 to the existing ambient concentrations, with a range from our sensitivity analysis of about 1 µg/m3 (non-significant) to over 5 µg/m3. Regarding peak (10 s) concentrations of PM2.5, the calibrated model indicated an increase of 17.4 µg/m3 (95% CI = 6.2, 28.5) from coal trains, which tended to contribute about 3.5 µg/m3 more than freight trains across the models examined. Calm wind conditions resulted in an increase from coal trains of 20 µg/m3 (95% CI = 3.4, 36.6; p < 0.05).

Given the known bias of humidity on optical PM monitors, in addition to controlling for humidity and dewpoint directly in the model specification, a regression model was estimated using data calibrated and corrected for humidity using a nearby FEM monitor (Barkjohn et al. 2021). It is well established that mass calibrations of optical sensors are temporally and spatially dependent on particle optical characteristics (Dubovik et al. 2002; Bond and Bergstrom 2006). The assumption here is that consistent calibration factors from monitors within the same geographic region and time period are reasonable surrogates for in situ calibration.

There are only a few previous studies that have measured PM2.5 concentrations from coal trains. One study examined coal and freight trains passing through the rural Columbia River Gorge (Washington) in the summer of 2014 (Jaffe et al. 2015). The study examined the difference between the 10 s maximum PM2.5 and the background concentration. The authors observed a doubling in peak concentration for coal trains (20.9 µg/m3) versus freight trains (10.7 µg/m3). This is consistent with our result of 17.4 µg/m3 for coal trains using a similar averaging time. The average effective wind speeds in the Jaffe study were much higher than those in our study and were often associated with very high concentrations of PM2.5. This suggests that PM2.5 concentrations associated with train passage are likely to be even greater in certain areas farther away from the City of Richmond’s urban setting due to greater train speeds.

A previous study collected data on coal trains operating in the Fraser River Delta area of British Columbia, Canada. In comparing ambient air impacts of the coal trains (n = 20) to background concentrations, the results suggested an increase of 5.3 (a 54% increase over background), 4.1, and 2.6 µg/m3, respectively, for PM3 (comparable to PM2.5), PM10, and PM20, with occasional spikes in PM3 from coal trains to 100 µg/m3 (Akaoka et al. 2017).

Another study collected data on a single day from four monitors located at varied distances from the train line on full (n = 10) and empty (n = 11) coal trains heading to and from the Port of Newcastle in New South Wales, Australia (Higginbotham et al. 2013). For full coal cars, there were increases of 2.9 and 7.2 µg/m3, respectively, for PM2.5 and PM10, and 7.1 and 18.9 µg/m3 for empty coal cars. Higher impacts for empty coal cars were also reported in studies by Katestone Environmental Pty Ltd (2013).

Finally, Ryan and Wand (2014) analyzed the impacts of freight, empty coal and full coal trains in the Hunter Valley in New South Wales, Australia. The crude (unadjusted) increases in PM2.5 for passing freight, empty coal and full coal cars were 0.53, 1.13 and 1.20 µg/m3, respectively; all were statistically significant differences from baseline levels. Their measurements indicated that particulate concentrations were elevated not only during but also prior to and especially after a train’s passing.

Most of the dust from coal trains occurs from the rail car (80%), with spilled coal (9%) and door leakage (6%) being other sources (Connell Hatch 2008). A consequence is coal dust deposition, with studies finding that, on average, coal composed 6–25% of deposited dust in rail corridors, although Akaoka et al. report up to 90% in local dust (Akaoka et al. 2017; DSITIA 2015). Evidence indicates that particulate matter from coal trains, storage and open mines can disperse at least 500 m from the source (Trivedi et al. 2009; Akaoka et al. 2017; Srivastava et al. 2021; Sahu and Pakra 2022).

To put our results into perspective, the current U.S. EPA 24-h and annual average standards are 35 and 12 µg/m3, respectively, while the World Health Organization guidelines for the same averaging times are 15 and 5 µg/m3 (U.S. EPA 2019; World Health Organization 2021). In addition, both U.S. EPA and WHO indicate that there is no threshold or safe level for ambient PM2.5. Therefore, a hypothetical three coal trains per week in an urban area could represent an important increase in PM2.5 to nearby residents. Incremental concentrations would subsequently increase the risk of a wide range of health effects including: premature mortality, cardiovascular and respiratory hospitalization or urgent care visits, increases in or exacerbation of asthma, adverse birth outcomes (e.g., low birth weight, prematurity, birth defects and neurodevelopment), possible neurological impacts in children and adults (autism, Alzheimer’s, Parkinson’s) as well as functional impacts such as days with respiratory symptoms, restricted activity, and work or school loss (WHO 2021). As noted above, even acute PM2.5 exposures as short as one hour (or a few hours) can increase the risk of adverse health outcomes, including: acute myocardial infarction, hospitalization and emergency department visits for cardiovascular and respiratory disease, ambulance calls and asthma exacerbation (Yorifuji et al. 2014; Kim et al. 2015; Chen et al. 2019).

Our study has several advantages including the development of an AI-based platform for precise identification of train types during the day or night; real time measurement of PM2.5 and meteorology; siting of a monitor with only the trains as a source of PM2.5; and the ability to produce data on train direction and speed. There were also some shortcomings in our study. There was only a small number of full and unloaded coal cars due to the reduction in economic activity during the COVID-19 pandemic and related supply chain issues. There was only a single monitor to measure the impact of passing trains. This was due to both logistical constraints pursuant to the COVID-19 pandemic and the difficulty in finding monitor host sites that were not impacted by other PM2.5 pollution sources in Richmond, a city transected by major highways, refineries, other heavy industry and a port. There is the possibility of exposure misclassification if some of the freight trains also included coal cars. The low R2 in some of the regression models could be due to several factors including the assignment of hourly wind, temperature and humidity to the 4–5 min of train passage and uncertainty in estimating train speed and length. There were also unmeasured factors such as train weight and number of engines. Finally, it is important to note that our analysis did not include measurements of either ultrafine (particles less than 0.1 micron) or coarse particles (PM10) which will always be generated from the passing trains. Since there is substantial evidence of adverse health effects from both of these particle sizes, the actual health risks posed by passing coal trains are clearly underestimated in this present study (Adar et al. 2014; Ostro et al. 2015).

Identifying the source of fugitive dust is important in part because the implications of exposure extend beyond individual and population health effects to matters of environmental and racial justice (Mikati et al. 2018). While coal dust can have far-ranging population exposures, the communities in relatively close proximity to the rail lines will be disproportionately exposed. These residents are more likely to be of lower-income or people of color (or both) and also more vulnerable to adverse health outcomes (Hricko et al. 2014; Jha and Muller 2017).

Finally, the impacts of the rail transport of coal are compounding because it involves traversing thousands of kilometers, meaning multiple environmental justice communities are impacted. Ecosystems such as rivers and coastlines also receive extended exposure as the rails often trace their contours. Further, the climate change implications of coal transport, storage and handling are significant, ultimately resulting in up to 16% of US carbon pollution (Meyer 2019).


In this paper, we have reported evidence of significant increases in PM2.5 due to passing coal-carrying trains in Richmond, California. The observed increases were greater than those produced by freight trains and passenger trains. Unloaded coal cars also generated increases in PM2.5, but at lower concentrations than full coal cars. Quantifying the contribution of coal trains to urban air pollution is important since vulnerable communities are typically found in close proximity to rail lines. In addition, inevitable dispersion of PM2.5 will increase population exposure over a much wider area. Since shipment of coal by train occurs throughout the world and in many urban areas, it represents a significant public health hazard. Finally, to overcome technical challenges that have historically been barriers to the study of coal trains, we developed an artificial intelligence-driven monitoring platform to detect and quantify air pollution from passing trains. These advancements will contribute to future studies of health effects from mobile sources.