Intercomparison of Sonde, WRF/CAMx and Satellite Sounder Profile Data for the Paso Del Norte Region

The Paso Del Norte (PdN) region comprises the city of El Paso, TX, Ciudad Juarez, Mexico, and some neighboring cities in the state of New Mexico. Developing a regional weather model for this region has always been challenging because of its complex terrain. To obtain more accurate weather and pollution forecasting for the PdN region, the results of the downscaled WRF (Weather Research and Forecasting) model were intercompared with meteorological satellite data and with ground and radiosonde datasets. In addition, it is critical to analyze the distributions of ozone concentrations to better understand atmospheric aerosol concentrations and to predict both more accurately. Hence, in this study the ozone results of CAMx (Comprehensive Air Quality Model with Extensions) were extensively intercompared with ozonesonde data. The radiosonde/ozonesonde data were obtained during a campaign conducted in the summer of 2017 in the PdN region. Meteorological variables (temperature, pressure, relative humidity, and wind speed) and ozone concentrations were compared at several locations in the PdN region. The TCEQ (Texas Commission on Environmental Quality) data from different CAMS (Continuous Ambient Monitoring Stations) were used for ground data intercomparison with the WRF results. The meteorological satellite sounding data were retrieved using an in-house satellite antenna receiver. The results of this research will provide better pollution forecasting capability not only for the PdN region but also for other regions with similar topography and terrain.


Introduction
Cities occupy less than 0.1% of the earth's total surface; however, half of the world's population lives in cities (Lee et al. 2011). The Paso Del Norte (PdN) region is a binational metropolitan region on the border of Mexico and the United States. This region is centered on two large cities: El Paso in the United States and Ciudad Juarez in Mexico. In addition to these two cities, some counties of the state of New Mexico (USA) are also included in this region (Fig. 1). The PdN is regarded as having the second largest metropolitan area in the USA, and the largest bilingual and binational work force in the western hemisphere (Philips 2010).
A unique geopolitical location characterizes the Paso Del Norte region. This region comprises three counties in southwestern Texas and southern New Mexico, United States, and the municipality of Ciudad Juarez in the northern part of the state of Chihuahua, Mexico (Garfin and Leroy 2018). The Rio Grande separates the two largest cities, El Paso and Ciudad Juarez, which are connected by five land bridges (Collins et al. 2009). The Paso Del Norte region has unique meteorological and topographical conditions. El Paso, which is intersected by the Franklin Mountains, contains the Kilbourne's Maar Volcanic peaks and is surrounded by the Chihuahua desert. The most well-known feature of the area is the Rio Grande, which divides El Paso, US, from Juarez, Mexico. This binational river flows through three US states (Texas, New Mexico and Colorado) and skirts the southern end of the Franklin Mountains, west of Juarez and El Paso (Garcia et al. 2004).
The regional climate is hot and dry for most of the year. Air quality issues arise from the regional weather combined with high emissions from automobiles and industrial activities. This region in particular is frequently affected by ozone and particulate matter (PM) pollution. Both ozone and PM have adverse health effects on humans, and therefore the accurate prediction and forecasting of pollutants is an essential prerequisite for the proper implementation of state regulations concerning air quality in the region.
Tropospheric ozone formation and aerosol concentrations share much of the same physics and chemistry. Ozone is formed through photochemical reactions of nitrogen oxides (NOx = NO + NO2) and volatile organic compounds (VOCs; Stockwell et al. 2012). These reactions produce atmospheric acids such as nitric acid, sulfuric acid and organic acids, and these acids are key aerosol precursors (Stewart et al. 2019). Ozone may react with organic compounds such as isoprene and other alkenes to produce organic compounds with low volatility that condense to produce secondary organic aerosol (SOA; Stockwell et al. 2019). Ozone and particulate matter concentrations are related to the same meteorological factors that determine the vertical structure of the atmosphere. The vertical structure of the atmosphere is also tightly connected to both ozone and particulate matter aerosol concentrations because of its effect on atmospheric stability (Calvert et al. 2015). We believe that more accurate prediction of the meteorological variables and ozone concentrations will contribute significantly to better prediction of aerosol concentrations.
Several air quality studies have been conducted in the PdN region in the past (Macdonald et al. 2001; Brown et al. 2001; Hicks et al. 2015; Mahmud et al. 2016; Karle et al. 2017a, b; Stewart et al. 2019; Karle et al. 2018, 2019, 2020). Based on the 1996 ozone study campaign (Macdonald et al. 2001), several research articles have been published (Lu et al. 2008; Pearson and Fitzgerald 2001; Pearson et al. 2007; Stockwell et al. 2013). However, comparisons of vertical profiles of different meteorological components and ozone data with corresponding experimental data have not been performed before in this region. Previously, global and local atmospheric chemistry models such as the Community Multiscale Air Quality (CMAQ) model or the Comprehensive Air Quality Model with Extensions (CAMx) were used in this region to calculate the effects of emissions on global oxidizing capacities and to develop ozone abatement strategies (Ngan et al. 2013; Mahmud 2016; Mahmud et al. 2016).
In our work, the physics schemes of the WRF model were carefully selected for this region and validated against the local data obtained from the ground observational monitoring stations of the Texas Commission on Environmental Quality (TCEQ). Furthermore, the vertical simulation results were compared with radiosonde data retrieved during the El Paso campaign in the summer of 2017, and with Metop-B satellite sounder profile data received using an in-house satellite antenna receiver. In addition, the Eulerian photochemical dispersion model CAMx (Comprehensive Air Quality Model with Extensions) was used to simulate the ozone episodes corresponding to the same period, and its results were compared against the ozonesonde data obtained from the El Paso sonde campaign in 2017. This permits more insight into the region and improves the forecasting capability.
WRF simulations for the summer of 2017 were performed for a variety of days that included high-ozone, low-ozone, high-temperature, and low-temperature cases. Different meteorological variables such as temperature, wind speed, relative humidity, and pressure were analyzed (Baumbach et al. 2008) for these selected time periods at four different locations within the Paso Del Norte region. In addition, the diurnal variation of these parameters was examined throughout the summer of 2017. Furthermore, vertical profiles of ozone concentration from the ozonesondes were intercompared with the CAMx results. Subsequently, statistical measures such as the correlation coefficient, median absolute deviation (MAD), mean square error (MSE) and root mean square error (RMSE) were computed to assess the accuracy of the model results. Finally, the average values of temperature and pressure from the WRF simulations, the radiosondes, and the Metop-B sounder profiles were intercompared.

WRF Simulation
For this study, WRF version 3.9.1 released by NCAR (National Center for Atmospheric Research) was used. The WRF model was configured with three domains for this simulation. The outer domain has a 172 × 172 mesh with a horizontal resolution of 36 km. The intermediate domain, which has a horizontal resolution of 12 km, also consists of a 172 × 172 mesh. The inner domain, which is the smallest, has a spatial resolution of 4 km and also uses a 172 × 172 mesh (Fig. 2). For the WRF simulation, we used a 7-day spin-up run for each day of simulation.
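The domain sizes implied by this configuration can be checked with simple arithmetic. The sketch below is illustrative only: it assumes square domains of 172 × 172 cells at the stated resolutions, and the names `DOMAINS`, `domain_extent_km`, and `nest_ratios` are hypothetical helpers that are not part of WRF.

```python
# Illustrative sketch: side lengths and nest ratios for the three domains.
# Mesh sizes and resolutions are taken from the text; all names are hypothetical.

DOMAINS = {
    "d01": {"cells": 172, "dx_km": 36},
    "d02": {"cells": 172, "dx_km": 12},
    "d03": {"cells": 172, "dx_km": 4},
}


def domain_extent_km(cells, dx_km):
    """Approximate side length of a square domain: number of cells times grid spacing."""
    return cells * dx_km


def nest_ratios(domains):
    """Ratio of parent to child grid spacing for each consecutive pair of domains."""
    names = list(domains)
    return {
        f"{parent}->{child}": domains[parent]["dx_km"] / domains[child]["dx_km"]
        for parent, child in zip(names, names[1:])
    }


extents = {name: domain_extent_km(d["cells"], d["dx_km"]) for name, d in DOMAINS.items()}
ratios = nest_ratios(DOMAINS)

if __name__ == "__main__":
    for name, side in extents.items():
        print(f"{name}: about {side} km per side")
    print(ratios)
```

Under these assumptions each nest refines its parent by a factor of three, and the innermost domain spans roughly 688 km per side.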
The outer domain, denoted d01, covered several US states, including Texas, New Mexico, Arizona, and Colorado, as well as the northern part of Mexico. The second domain (d02) covered mostly the southeast part of Texas, some parts of Juarez city, and some counties of New Mexico. The smallest domain (d03) focused on the Paso Del Norte region, which is the region of interest for this paper (Karle et al. 2020). Domains 2 and 3 were two-way nested.
The performance of the WRF model depends on the choice of suitable physics schemes. It is necessary to identify the best physics options for a specific region, depending on the geographical, topographical, and seasonal characteristics of its synoptic and thermodynamical features. In this paper, a comprehensive study of the different physics schemes was conducted and the best schemes were selected for this region. The Global Forecast System (GFS) is a weather forecast model produced by the National Centers for Environmental Prediction (NCEP). For the initial and boundary conditions, GFS analysis data with 0.5-degree spatial resolution were used (Yahya et al. 2014). The simulation ran with a 6-h interval for different days. For the planetary boundary layer scheme, the first-order closure scheme known as the Yonsei University (YSU) method (Hong et al. 2006) was chosen. For the vertical profiles, 35 vertical sigma levels were used, with the bottom level at an eta value of 1.0 and the top level at 0.0. The physical parameterizations used in the current simulations are as follows: the WRF single-moment (WSM) scheme (Hong et al. 2004) for microphysics, the Unified Noah land surface scheme (Tewari et al. 2004) for the land surface, the Eta similarity scheme (Monin and Obukhov 1954) for the surface layer, and the Kain-Fritsch scheme (Kain 2004) for cumulus parameterization.

CAMx Simulation
An Eulerian photochemical dispersion model, the Comprehensive Air Quality Model with Extensions (CAMx), was used to simulate ozone during the El Paso sonde campaign studies that took place in June 2017 over the PdN region. CAMx version 6.1 was used (Environ 2011). The CAMx model requires a meteorological model to produce meteorological fields and an emissions processing system. The emissions were processed with the Sparse Matrix Operator Kernel Emissions (SMOKE) model (Houyoux and Vukovich 1999). The SMOKE model was used to convert the source-level emissions (total county emissions) reported every year to model-ready emissions that are spatially resolved, hourly and aggregated into model species. The output of the meteorological model described above (WRF) was used for the meteorological background at hourly intervals. The emission inventory used in this study is the US Environmental Protection Agency's (EPA) National Emission Inventory released originally in 2015 (NEI15). Since the modeling domain includes both the USA and Mexico, the latest released Mexico emission dataset (Wolf et al. 2009), which provides data for six northern border states of Mexico, was also obtained to supplement NEI15 and used in the simulation.
Fig. 2 The nested domain configuration used for the WRF simulations. The coarse, middle and fine domains have spatial resolutions of 36, 12 and 4 km, respectively
The CAMx model was also run over a three-domain nested configuration with 36-, 12- and 4-km resolutions for the coarse, middle and fine domains, respectively. The WRF output is converted to a format that SMOKE and CAMx can read, and during this process the WRF vertical layers are collapsed into 24 levels to reduce the computational cost. However, the 15 segments within the PBL are left unchanged to maintain high resolution at elevations where emissions and chemical reactions of pollutants occur. All three CAMx grids had identical vertical layer structures spanning the entire troposphere and lower stratosphere up to a pressure altitude of 100 mb. For each ozone case, the CAMx model was run for ten consecutive days, of which the first nine simulated days were treated as a spin-up period. The boundary conditions (BCs) of the coarse-domain (36-km) simulation were extracted from a MOZART (Model for Ozone and Related chemical Tracers) global chemistry model (GCM) simulation of 2017 (Lee-Taylor and Madronich 2002). For the first simulated day of each case, the initial conditions were obtained from the MOZART model: the MOZART output species were interpolated from the MOZART horizontal and vertical coordinate system to the CAMx LCP coordinate system and vertical layer structure, and the MOZART chemical species were mapped to the chemical mechanism used by CAMx (Brasseur et al. 1998). Initial and boundary conditions for the 12- and 4-km simulations were subsequently extracted from the CAMx 36-km simulation results on an hourly basis. For the warm start-up run (e.g., cycle running), the simulation results of the previous day are used to produce initial and boundary conditions.
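The ten-day run window with nine spin-up days can be laid out as a short date calculation. This is a minimal sketch, not the authors' run script: it assumes that the ozone episode day is the last simulated day, and the function name `camx_run_window` and the constants are hypothetical.

```python
from datetime import date, timedelta

RUN_DAYS = 10    # consecutive simulated days per ozone case (from the text)
SPINUP_DAYS = 9  # leading days treated as spin-up (from the text)


def camx_run_window(episode_day):
    """Return (all_days, spinup_days, analysis_days) for one ozone case.

    Assumption: the episode day is the final day of the ten-day run.
    """
    start = episode_day - timedelta(days=RUN_DAYS - 1)
    all_days = [start + timedelta(days=i) for i in range(RUN_DAYS)]
    return all_days, all_days[:SPINUP_DAYS], all_days[SPINUP_DAYS:]


if __name__ == "__main__":
    # Example with one of the sonde launch dates mentioned in the paper.
    days, spinup, analysis = camx_run_window(date(2017, 6, 22))
    print("run starts:", days[0])
    print("spin-up ends:", spinup[-1])
    print("analysis day(s):", analysis)
```

Only the days left after the spin-up slice would be compared against the ozonesonde profiles under this assumption.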

Radiosonde and Ozonesonde Launching
Sondes are packages which are attached to the weather balloon and allowed to rise through the atmosphere to sample data at frequent intervals. Sondes usually reach the height of 30 km in the atmosphere, depending on the size of the balloon. For the current study, two types of weather balloons were used, one with a weight of 600 g and the other with a load of 350 g. These balloons reached up to the average height of 30 km and 20 km, respectively.
The radiosonde and ozonesonde data were obtained from a campaign called the Tropospheric Ozone Pollution Project, which took place during the summer of 2017 as a collaboration between the University of Texas at El Paso, New Mexico State University and St. Edward's University. Sixty radiosondes were launched from four different locations in the Paso Del Norte region during this campaign. The radiosondes were built and developed by IMET (International Met Systems) and can measure different meteorological parameters at different heights (Wierenga et al. 2005). The ozonesondes used in the campaign were built by the En-Sci manufacturing company. The locations and timing of the sonde launches are presented in Table 1.
The radiosonde launches generally took place at midday or in the early afternoon to capture the maximum values of the meteorological parameters (Rappenglück et al. 2008). This timing was also relevant for determining the height of the convective boundary layer. To calibrate the radiosonde data, each launch site included a surface observational station against which the data could be compared. A detailed description of the radiosonde and ozonesonde is given in Table 2.

Observational Surface Data
The TCEQ (Texas Commission on Environmental Quality), with the help of the EPA (United States Environmental Protection Agency), set up a grid of observational data collection stations throughout the state of Texas (US). These stations are known as CAMS (Continuous Ambient Monitoring Stations). CAMS are used for measuring both air and water pollutants across the state of Texas. In addition to measuring air pollutants, CAMS contain instruments to estimate local surface meteorological conditions. CAMS also contain equipment that measures ambient gaseous materials and particulate matter, including the ambient concentrations of ozone, carbon monoxide and oxides of nitrogen. Particulate matter is measured in two classifications: PM10 (particles with an aerodynamic diameter of 10 microns or less) and PM2.5 (particles with an aerodynamic diameter of 2.5 microns or less) (EPA 2015).
For the current study in the PdN region, four different locations around the region were chosen for validation purposes (Fig. 3). These locations differ significantly from each other from an environmental viewpoint.

Metop-B Satellite
Metop (Meteorological Operational) is Europe's first series of polar-orbiting operational meteorological satellites. It is the European contribution to the Initial Joint Polar System (IJPS), a co-operative agreement between the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) and the US NOAA (National Oceanic and Atmospheric Administration) to provide data for climate and environmental monitoring and improved weather forecasting (Edwards and Pawlak 2000). Metop-A and Metop-B are currently active, with Metop-C recently launched.
The Metop-B spacecraft is the second in a series of three European-developed satellites used for weather forecasting and for collecting long-term data sets for climate records of the Earth. It carries a set of state-of-the-art sounding and step-and-dwell imaging instruments that offer improved remote sensing capabilities to both meteorologists and climatologists. Among these instruments, the legacy ATOVS (Advanced TIROS Operational Vertical Sounder) was used in this study to retrieve different meteorological parameters (Li et al. 2000). ATOVS consists of the High-Resolution Infrared Radiation Sounder (HIRS), the Advanced Microwave Sounding Unit-A (AMSU-A) and AMSU-B for retrieving temperature, humidity, and ozone soundings in all weather conditions (NOAA website 2014). Currently, ATOVS generates profile data from NOAA-19, Metop-A, and Metop-B. This instrument package provides information on temperature and humidity profiles, total ozone, clouds and radiation on a global scale to the operational user community (Table 3).
Using an in-house antenna receiver located at the Physics Department of the University of Texas at El Paso, the Metop-B satellite data were extracted in real time. The Metop-B satellite passes over the Paso Del Norte region four times every 24 h. In the daytime, the first pass starts at 17:26 MST and ends at 09:33 MST, and the second pass starts at 11:06 MST and ends at 11:12 MST. At night the satellite also makes two passes: the first starts at 20:43 MST and ends at 20:50 MST, and the second starts at 22:23 MST and ends at 22:30 MST.

Results and Discussion
The WRF model results were compared against ground observational data from the Continuous Ambient Monitoring Stations of the Texas Commission on Environmental Quality. Several representative days were selected for the intercomparison, including high-ozone, low-ozone, high-temperature, and low-temperature days. First, we intercompared the WRF ground temperature results with TCEQ surface temperature observations at four different locations. The selected days were mostly high-ozone and high-temperature days from the summer of 2017. Subsequently, we extracted the WRF vertical profiles of different meteorological variables at the same locations using NCAR Command Language (NCL) scripts and compared them against the corresponding radiosonde data.
The days and locations selected were May 15 for the UTEP location, June 06 for the Skyline location, June 12 for the Santa Teresa location, and June 22 for the Socorro location. The radiosondes were launched from the same locations on the same days. The vertical profiles of ozone concentration obtained with the ozonesondes on three different high-ozone days were also intercompared with the corresponding CAMx vertical results. Figure 4 shows the diurnal variation of temperature at four different locations for four different days. Before and after sunrise the simulated and observed values are in closer agreement, while in the middle of the day, when the temperatures were at their maximum, the differences increased. In general, however, there is close agreement between the observational data and the simulation data. Figure 5 shows an intercomparison between the model results and the data from the radiosonde released at the Socorro location at 13 MST on June 22. Pressure and temperature show excellent agreement between the observations and the model results; however, there are some discrepancies for relative humidity and wind speed.
The vertical intercomparisons reach a height of up to 20 km above ground level (Fig. 6). Similarly, Figs. 7 and 8, which represent two different days (May 15 and June 12) at two different locations (UTEP and Santa Teresa), exhibit the same trend. Figure 9 shows the intercomparison of temperature, relative humidity, and wind speed between the WRF model and TCEQ CAMS 12 at the UTEP location. The correlation coefficients for the three graphs are 0.95, 0.87 and 0.56, respectively. The horizontal axis represents the days of the summer, while the vertical axis shows the values of the various meteorological parameters.
To perform the ozone vertical profile intercomparison, we chose three different days at three different locations in our area of study. The ozonesonde measures the ozone concentration from ground level up to at least the top of the troposphere. As depicted in Fig. 10, the ozone concentration remains constant up to 5 km above ground level. The ozone concentration increased in the upper part of the troposphere or after entering the stratosphere. As all of those launches occurred at midday, the change in ozone concentration indicated the planetary boundary layer heights (Couach et al. 2003) for those days, which were around 5 km.
Finally, to intercompare the values from WRF, the radiosondes, and the Metop-B satellite, the average values of temperature and pressure were calculated for all the locations on selected representative days, as depicted in Figs. 11 and 12. After comparing data from four different TCEQ locations and the balloon launches, we conducted several statistical tests on the datasets of numerous meteorological parameters. For the vertical profile, we chose the values at 9 different altitudes: 1.5, 2, 3, 4, 5, 6, 10, 15 and 20 km. Statistical measures such as the correlation coefficient (R) and the median absolute deviation (MAD), which quantifies the variability of a univariate sample of quantitative data, were applied to those data sets. The mean square error (MSE) and root mean square error (RMSE) were also used to calculate the difference between the simulated data and the experimental data. The mean absolute percentage error (MAPE), which is a measure of prediction accuracy, was also applied in our study. Finally, we computed the index of agreement and the bias error to assess how closely the predictions matched the observed values.
Fig. 6 Meteorological profiles of June 05 with Skyline observational data: a wind speed (m/s), b temperature (C), c pressure (hPa), and d relative humidity (%)
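The statistical measures named above can be written compactly in plain Python. This is a minimal sketch under stated assumptions, not the authors' analysis code: the paper does not give its formulas, so the index of agreement is assumed here to follow Willmott's common definition, the bias is taken as the mean of model minus observation, and all function names are hypothetical.

```python
import math
from statistics import mean, median


def bias(model, obs):
    """Mean bias error: average of (model - observation)."""
    return mean([m - o for m, o in zip(model, obs)])


def mse(model, obs):
    """Mean square error between paired model and observed values."""
    return mean([(m - o) ** 2 for m, o in zip(model, obs)])


def rmse(model, obs):
    """Root mean square error."""
    return math.sqrt(mse(model, obs))


def mape(model, obs):
    """Mean absolute percentage error in percent (observations must be nonzero)."""
    return 100.0 * mean([abs((m - o) / o) for m, o in zip(model, obs)])


def mad(values):
    """Median absolute deviation of a single sample around its median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def correlation(model, obs):
    """Pearson correlation coefficient R."""
    mm, mo = mean(model), mean(obs)
    cov = sum((m - mm) * (o - mo) for m, o in zip(model, obs))
    var_m = sum((m - mm) ** 2 for m in model)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)


def index_of_agreement(model, obs):
    """Index of agreement (assumed: Willmott's form), 1.0 means perfect agreement."""
    mo = mean(obs)
    num = sum((m - o) ** 2 for m, o in zip(model, obs))
    den = sum((abs(m - mo) + abs(o - mo)) ** 2 for m, o in zip(model, obs))
    return 1.0 - num / den


if __name__ == "__main__":
    # Synthetic values for illustration only; these are not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [1.5, 2.5, 3.5, 4.5]
    print("bias:", bias(simulated, observed))
    print("RMSE:", rmse(simulated, observed))
    print("R:", correlation(simulated, observed))
    print("IOA:", index_of_agreement(simulated, observed))
```

In practice the paired lists would hold model and sonde values at the nine selected altitudes, so every measure is computed on the same nine points.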
As shown in Table 4, temperature and pressure at the four different locations match our regional weather model simulation data very well. The indices of agreement and correlation coefficients showed a strong relationship between observed and simulated values in every case. On the other hand, the wind speed and relative humidity show lower correlations between simulated and observed values. The discrepancy for both quantities increased especially above the troposphere.
The bias adjustment technique is one of the many ways of improving the intercomparisons between observational and satellite data, and numerous studies have been conducted (Ngan et al. 2013). Stochastic volatility models for extreme fluctuations in meteorological time series could be applied in the future (Bhuiyan 2020; Mariani et al. 2018). Choosing the proper grid size for the domain and using spectral nudging is another way of resolving this issue (Liu et al. 2012; Heikkilä et al. 2011). Another recent approach, which uses initial conditions for the WRF preprocessing system from FV3, applied by meteorologists around the world (Lin et al. 2016), is under development.

Conclusion
In this paper, the WRF and CAMx model simulation results were presented and tested against corresponding experimental data. In the summer of 2017, a tropospheric study was conducted to obtain vertical meteorological and ozone data throughout the Paso del Norte region. WRF and CAMx simulations were also performed for the summer of 2017, and the results were intercompared with the ground and vertical observational data. Different meteorological parameters such as temperature, wind speed, relative humidity, wind direction, and pressure were intercompared. The observational ground data from the Texas Commission on Environmental Quality's Continuous Ambient Monitoring Stations were used. The vertical meteorological and ozone data were intercompared against sonde data. The temperature and pressure results of the WRF model showed excellent agreement with all the observational data, while the relative humidity and wind showed reasonable agreement. We attribute the minor discrepancies to the fact that the model provides averages of the values, even at the smallest 4 km grid size, while the observations are point values. The temperature, for example, is more uniform throughout the 4 km grid, resulting in better agreement with the local point observations. The CAMx model performs well in our simulations; however, some over-predictions relative to the ozonesonde data are observed at higher altitudes. Because the balloon moves horizontally as well as vertically, the intercomparison cannot be performed at exactly the same location, which causes these discrepancies.
Fig. 8 Meteorological profiles of June 12 with Santa Teresa observational data: a relative humidity (%), b pressure (hPa), c temperature (C), and d wind speed (m/s)
Subsequently, the same meteorological parameters were obtained from the Metop-B satellite, which can provide vertical profile data for those variables (EUMETSAT 2018). Using an in-house satellite antenna, the satellite results were intercompared against the WRF results and the radiosonde data. Several appropriate statistical tests were performed to assess the accuracy. Although the timing of the radiosonde launches and the satellite passes was not accurately synchronized, and a temporal average of the whole datasets was needed to make the intercomparisons, the comparison between the satellite, WRF, and radiosonde data showed good agreement.
This study provides valuable insight and direction for future work in the Paso del Norte region and similar Southwest regions, particularly in assessing the effect of mountainous terrain, for planetary boundary layer studies, for satellite meteorological data validation, and towards improving the accuracy of air quality and numerical weather prediction model simulations.
... observations during the 2017 campaign. The authors would also like to acknowledge the Texas Commission on Environmental Quality (TCEQ) for its financial and intellectual support. Fitzgerald, Lu and Stockwell would also like to express their gratitude to the NOAA Center for Atmospheric Science-Meteorology (NCAS-M), which is funded by the National Oceanic and Atmospheric Administration/Educational Partnership Program under Cooperative Agreement #NA16SEC4810006, for its support.

Compliance with ethical standards
Conflict of interest On behalf of all authors, the corresponding author states that there is no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.