1 Introduction

Benzene, long known as a water and soil pollutant in the proximity of crude and refined hydrocarbon storage sites, has been recognized as a relevant environmental issue for urban atmospheres in Europe since the late twentieth century, leading to the issuance of the Second Daughter Directive of the Air Quality Framework (2000/69/EC), recently integrated into Directive 2008/50/EC. In Italy, Ministerial Decree 60/2002 transposed the Directive, setting a limit value of 10 μg m−3 to be applied until 31 December 2005 and a progressive reduction of the limit down to 5 μg m−3 to be applied from January 2010 onwards. The limits were confirmed by Legislative Decree 155/2010, which implements Directive 2008/50/EC, known as “Clean Air For Europe.”

Volatile organic compounds usually monitored in urban centers include toluene, ethylbenzene, and xylenes (reported to have harmful effects on the central nervous system (ATSDR 2012)) and benzene, classified by the International Agency for Research on Cancer as a human carcinogen, with updates of the assessment pending (Infante 2011; Cogliano et al. 2011).

As regards the sources of these atmospheric contaminants, most of the benzene in urban air is emitted as non-exhausted volatile organic compounds (VOCs) from mobile sources, while a smaller amount comes from evaporative losses. The EU Petrol Vapor Recovery Directives 94/63/EC and 2009/126/EC regulate the storage of petrol and the refueling of vehicles, with stricter rules in force since January 2012 to reduce VOC emissions by recovering the vapor; Directive 98/70/EC sets the maximum amount of benzene in gasoline to less than 1 %, and this was confirmed by Directive 2009/30/EC.

Before the enforcement of Directive 98/70/EC, average atmospheric levels of benzene, toluene, ethylbenzene, and xylene (BTEX) in cities were significantly higher than the current ones, adversely affecting citizens' health. For instance, Brocco et al. (1997) reported yearly average concentrations of benzene in Rome (Italy) of 40 and 47 μg m−3 during 1992 and 1993, respectively. Despite the general decrease of benzene in air, the Air Quality Guidelines for Europe from WHO (2000) state that no safe level of exposure can be recommended, since benzene is carcinogenic to humans.

Urban atmospheres are still critical due to high vehicular traffic combined with high population densities; besides fuel specifications, city management plays a fundamental role in determining VOC pollution levels by defining traffic plans and limitations and the positioning of gas stations. This kind of pollution also depends on meteorological variables (seasonal and daily variations), socio-economic variables (daily and hourly traffic intensity and fluidity), and geographical variables (main highways, parking places, land morphology, etc.). Other sources, such as industrial plants (e.g., coke ovens) or fuel tank farms, can play a role, but they are not the object of the present work.

The scientific literature reports studies and monitoring campaigns on BTEX in air conducted using passive samplers, highlighting factors affecting spatial and temporal patterns in cities and towns. Pekey and Yilmaz (2011) rationalized BTEX concentrations in the industrial city of Kocaeli, Turkey. Caselli et al. (2010) reported on two weekly campaigns in the city of Bari, Italy, while Sturaro et al. (2010) studied correlations between BTEX and phenols during 2007 at two sites in the city of Padua, Italy. Roukos et al. (2009, 2011) studied different meteorological situations that occurred in 2007 in Dunkerque (France) and related VOC concentrations to ozone and long-range transport of pollutants. Bruno et al. (2008) carried out daily campaigns in the town of Canosa di Puglia, Italy, while Hoque et al. (2008) described the situation of Delhi (India) with respect to traffic, as was done by Monteiro Martins et al. (2007) for Rio de Janeiro, Brazil. Khoder (2007) performed a short-term study in Cairo, Egypt. Simon et al. (2004) described spring and summer campaigns near Toulouse (France) in 1999 and 2001, before and after the enforcement of Directive 98/70/EC on the benzene content of gasoline. More recent studies focused on the relationship between ambient air and the indoor environment (Chen et al. 2011; Hun et al. 2011; Esplugues et al. 2010; Johnson et al. 2010; Zielinska et al. 2012). Some BTEX levels determined during the European monitoring campaigns mentioned above and described in other references are summarized in Table 1.

Table 1 BTEX concentrations determined during exemplary monitoring campaigns in European countries

To the best of our knowledge from the literature review, no studies appear to report multi-year trends of BTEX. In 2001, in consideration of the co-presence of potentially critical sources of VOCs such as traffic, a steelworks with coke oven, and harbor activities related to crude oil pumping into a pipeline close to the urban center, the Regional Environmental Protection Agency of Friuli Venezia Giulia (ARPA-FVG) started a diffuse monitoring program in the coastal city of Trieste (Italy), with a quasi-monthly frequency at 21 sites by means of passive samplers. The city has almost 210,000 inhabitants and faces the northernmost part of the Adriatic Sea. As of 2012, the monitoring program is still running.

An assessment of air quality requires consistent and periodic monitoring of pollution indicators as well as of other parameters, such as meteorological data, which are useful for interpreting pollutant dynamics. This large amount of data can be interpreted using chemometrics and, more specifically, time series analysis (Astel et al. 2008), multiway principal component analysis (Astel et al. 2010a), and receptor-oriented models (Astel et al. 2010b; Astel 2010). However, most of these techniques have their own requirements, such as data continuity for time series analysis. To overcome the data discontinuity that often affects long-term monitoring projects, specific procedures become necessary. One of them is a self-consistent iterative procedure called expectation maximization (Stanimirova et al. 2007); another is the application of chemometric algorithms that are relatively robust to missing elements, i.e., self-organizing maps.

The aim of the study was to model a discontinuous long-term monitoring data set of BTEX levels using self-organizing maps, in order to explore the spatiotemporal variations of these pollutants after the lowering of benzene levels in gasoline and the removal of fuel stations from city centers. Additional aims were to determine whether the levels of BTEXs depend on the geographical location of the sampling stations (as a direct consequence of traffic intensity) and to investigate the relations between basic meteorological parameters (wind speed and temperature) and the BTEX levels measured in the atmosphere of the city of Trieste.

2 Experimental

2.1 Sampling Sites and Sampling Procedure

Air samples were collected at 21 fixed stations spread across the city of Trieste, Italy, forming an irregular grid approximately 5 km long from north to south and 1.5 km wide from east to west. This rectangular layout reflects the geographical features of the city, which develops longitudinally between the Adriatic Sea coast (west) and the highlands (east). Fixed stations were located both in the city center and in its northern and southern peripheries, dividing the entire city area into several subareas according to their location (Fig. 1).

Fig. 1 Passive sample location map

The urban area located to the west of the city center was not included in the sampling protocol since it is characterized by inhomogeneous features (generally low traffic, a hilly residential zone located immediately to the west of the city center, highways with fluid traffic located along the coast line, and harbor activities near the sea). The district abbreviations shown in Fig. 1 and their urban classification are explained in Table 2.

Table 2 An explanation of the codes of the streets according to the sampling grid located in the city of Trieste

Pollution in the area of interest is mainly due to vehicular traffic and, during cold seasons, domestic heating. Moreover, a steelworks with coke oven is located on the sea shore near the monitoring station of Baiamonti Str., while the industrial zone proper extends to the south of the monitored area (out of the map). As regards vehicular traffic, the city center is mainly impacted by small private vehicles (cars and motorcycles) and public transportation (buses and taxis). During the last few years, access to most of the downtown and semi-peripheral areas has been progressively restricted to vehicles featuring more advanced anti-pollution devices and meeting more demanding emission standards. The effect of this strategy has been a significant decrease of air pollution levels (e.g., the decrease of benzene concentration shown in Fig. 2), in spite of the general trend towards an increasing number of circulating vehicles and their increasing use. The latter is favored by the reduced fuel prices established by local administrators because of the proximity of the Slovenian border, where fuel is generally cheaper than in Italy since it is less taxed.

Fig. 2 Annual average benzene concentration in air (μg m−3) in the city of Trieste in the period between 2001 and 2008 (maps acquired with Golden Surfer v.8.0; interpolation algorithm used: inverse distance to a power (power = 2))

The local worsening of air quality reflected by an increase of benzene concentration in 2006 could be attributed to extended road works in the street connecting Garibaldi Square and Corso Italia and to the consequent difficulties for vehicle circulation, both in this area and in the neighboring ones, caused by the alternative traffic paths.

Air samples were collected in the period between January 2001 and December 2008 with Radiello RAD 130 samplers (Sigma-Aldrich, USA) for BTEX sampling, according to the procedure given by the supplier of Radiello (Radiello 2003; Cocheo et al. 2009). Samplers were mounted on appropriate supports located 2.5–3.0 m above ground level. Samplers were exposed for 3 weeks, from the 1st to the 21st day of every month, and then stored in the laboratory at +4 °C. Two milliliters of carbon disulfide (free from benzene) was added to the Radiello glass tube. Tubes were mixed manually and left for 30 min; then, they were mixed again and left for another 30 min. The CS2 solution containing the extracted hydrocarbons was transferred to a crimp-capped 2-mL glass vial fitted with a PTFE cap.

2.2 Determination of Analytes

Analysis was performed by means of a Clarus 500 gas chromatograph (Perkin-Elmer, USA) equipped with an autosampler. The column used was an SGE BPX35 (length 30 m, diameter 0.25 mm, film thickness 0.25 μm) or equivalent. One microliter of the sample was injected in split mode through a traditional injector held at 235 °C, while the oven temperature was ramped from 35 to 265 °C at a rate of 8 °C min−1. A flame ionization detector (FID) was used. Calibration was performed by analyzing five standard solutions containing, respectively, 1, 2, 5, 10, and 20 mg L−1 of benzene, ethylbenzene, o-xylene, m-xylene, and p-xylene, and 4, 8, 20, 40, and 80 mg L−1 of toluene. Method performance was checked by periodically injecting (every 20 injections) a blank and two standard solutions. The method had been previously validated through an intercalibration test among different laboratories (Villalta et al. 2004). The detection limit was lower than 0.05 mg L−1. Collected data were converted to air concentrations (μg m−3) according to the instructions given by the supplier of Radiello (Radiello 2003), considering the time and average temperature of exposure.
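
The conversion from extract concentration to air concentration can be illustrated with a minimal sketch. It assumes the usual diffusive-sampler relation (collected mass divided by uptake rate times exposure time) and a power-law temperature correction of the uptake rate with exponent 1.5; both should be checked against the supplier's instructions (Radiello 2003). The uptake rate of 80 mL min−1 used in the example is a placeholder, not a value taken from this study, and blank subtraction is omitted for brevity.

```python
def mass_on_cartridge_ug(extract_conc_mg_l, solvent_volume_ml=2.0):
    """Analyte mass in the CS2 extract: (mg/L) x (mL) = ug."""
    return extract_conc_mg_l * solvent_volume_ml


def uptake_rate_ml_min(q298_ml_min, temp_k):
    """Uptake rate at the mean exposure temperature.

    ASSUMPTION: Q_T = Q_298 * (T / 298)^1.5; verify against the
    supplier's manual before any real use.
    """
    return q298_ml_min * (temp_k / 298.0) ** 1.5


def air_concentration_ug_m3(extract_conc_mg_l, q298_ml_min,
                            exposure_min, temp_k=298.0):
    """Time-averaged air concentration in ug/m3."""
    mass_ug = mass_on_cartridge_ug(extract_conc_mg_l)
    # mL/min x min = mL of air sampled; 1 mL = 1e-6 m3
    volume_m3 = uptake_rate_ml_min(q298_ml_min, temp_k) * exposure_min * 1e-6
    return mass_ug / volume_m3


# Hypothetical example: 1 mg/L in the extract, 21 days of exposure,
# placeholder uptake rate of 80 mL/min at 298 K.
three_weeks_min = 21 * 24 * 60
example = air_concentration_ug_m3(1.0, 80.0, three_weeks_min)
```

With these placeholder inputs the same extract concentration corresponds to a higher air concentration at lower exposure temperatures, because the assumed uptake rate decreases with temperature.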

2.3 Meteorological Data

The climate of Trieste is Mediterranean, characterized by episodes of a strong katabatic ENE wind known as Bora; during the coldest month (January), the average temperature approaches 6 °C, while in the warmest month (July), it is slightly higher than 24 °C. Meteorological data are provided by the meteorological service of ARPA-FVG, which manages a synoptic station and several micrometeorological stations. Average data for temperature (°C) and wind speed (m s−1) recorded by the seven automatic air quality and meteorological stations located inside the sampling area are reported in Table 3.

Table 3 Basic statistics concerning BTEX concentrations (μg m−3) and meteorological parameters according to sampling site

2.4 Data Analysis Procedure

Various multivariate statistical techniques are commonly used for air pollution modeling and assessment (Astel et al. 2008, 2010a, b; Astel 2010). In the present study, the self-organizing map (SOM) algorithm, one of the most efficient neural network architectures for solving problems in the field of exploratory data analysis, clustering, and data visualization, was applied. The SOM approach was originally proposed by Kohonen (1984); it is a model that implements a characteristic nonlinear projection from the high-dimensional space of objects onto a low-dimensional array of neurons. The theoretical background of the SOM approach can be found elsewhere (Kohonen 1984, 1995; Kohonen et al. 1996; Vesanto 2000; Vesanto et al. 2000; Giraudel and Lek 2001; Fort 2006; Cottrell et al. 1998; Ultsch and Siemon 1990), which is why only its major advantages, related to exceptional visualization, dimension reduction, and exploration abilities, are briefly presented below.

2.4.1 General Idea

The SOM algorithm, much like conventional ordination methods, is based on the idea of displaying a high-dimensional signal manifold on a much lower dimensional network (usually a 2D space) in an orderly fashion. An SOM produces virtual communities in a low-dimensional lattice through an unsupervised learning process. The term “self-organizing” refers to the ability to learn and organize information without being given the associated dependent output values for the input pattern (Mukherjee 1997). An SOM is noise tolerant; this property is highly desirable when site-measured data are used. The advantages of the SOM in relation to conventional ordination methods such as principal component analysis and cluster analysis are discussed elsewhere (Astel et al. 2007; Zhang et al. 2008).

2.4.2 Kohonen’s Neural Network Map Description

The SOM consists of neurons organized on a regular, usually two-dimensional grid. The neurons are connected to adjacent neurons by a neighborhood relation, which dictates the topology, or structure, of the map; thus, air samples that are similar from the chemical point of view should be mapped close together on the grid. During iterative training, the algorithm adjusts the grid nodes of the SOM so that they represent the whole data set, and their weights are optimized. In our study, the batch training algorithm was applied (Ripley 1996; Vesanto et al. 2000; Sarkisian 2010). When an SOM has been trained, the map needs to be evaluated to find out whether it has been optimally trained or whether further training is required. The number of grid nodes to be used in an SOM analysis may vary from a few dozen up to several thousand and can be considered as a trade-off between representation accuracy and the interpretability of the resulting clusters. A small number of grid nodes will result in a high quantization error (QE, scaled as 1–0) and well-defined clusters, while a large number of grid nodes results in a low quantization error and, in the most extreme case, a cluster for each data sample. In other words, the lower the quantization error, the higher the number of clusters to be interpreted. The quantization error, traditionally related to all forms of vector quantization and clustering algorithms, is a measure which completely disregards map topology and alignment. The QE is computed as the average distance of the sample vectors to the cluster centroids by which they are represented. In the case of the SOM, the cluster centroids are the prototype vectors. Topology preservation is quantified by the topographic error (TE) (Kohonen 2001), which is the simplest of the topology preservation measures. The lower the topographic error (i.e., the closer to 0), the better the SOM preserves the topology (Arsuaga and Díaz 2005).
TE is computed as follows: for all data samples, the respective best and second best matching units are determined. If these are not adjacent on the map lattice, this is considered an error. The total error is then normalized to a range from 0 to 1, where 0, as mentioned above, means perfect topology preservation. Unfortunately, there are no reference values of QE and TE. They strongly depend on the map resolution. According to Peeters et al. (2006), quantization error rapidly decreases with increasing number of grid nodes, while topographic error converges to a stable value. As suggested by Vesanto et al. (2000), the most satisfactory compromise between clustering ability, quantization error, and topographic error could be achieved when the number of nodes (n) is determined using the following formula: \( n=5\cdot \sqrt{{\mathrm{number}\,\,\mathrm{of}\,\,\mathrm{samples}}} \).
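
The two quality measures and the node-count heuristic described above can be sketched in a few lines. This is an illustration only, not the SOM Toolbox implementation used in the study: it assumes plain Euclidean distances without missing values, a rectangular lattice stored row by row, and an adjacency rule in which two units are neighbors when they differ by at most one row and one column (the study used a hexagonal lattice). All function names are hypothetical.

```python
import math


def suggested_node_count(n_samples):
    """Heuristic n = 5 * sqrt(number of samples)."""
    return 5 * math.sqrt(n_samples)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def two_best_units(sample, prototypes):
    """Indices of the best and second best matching units."""
    order = sorted(range(len(prototypes)),
                   key=lambda j: euclidean(sample, prototypes[j]))
    return order[0], order[1]


def quantization_error(samples, prototypes):
    """Average distance between each sample and its best matching unit."""
    total = 0.0
    for s in samples:
        bmu, _ = two_best_units(s, prototypes)
        total += euclidean(s, prototypes[bmu])
    return total / len(samples)


def topographic_error(samples, prototypes, n_cols):
    """Fraction of samples whose two best units are not adjacent.

    ASSUMPTION: rectangular lattice, units indexed row by row,
    adjacency = at most one step in each grid direction.
    """
    errors = 0
    for s in samples:
        first, second = two_best_units(s, prototypes)
        r1, c1 = divmod(first, n_cols)
        r2, c2 = divmod(second, n_cols)
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(samples)


# Toy 1 x 3 maps with one variable per unit (synthetic values).
ordered_map = [[0.0], [1.0], [2.0]]    # neighbors hold similar weights
scrambled_map = [[0.0], [2.0], [1.0]]  # same weights, topology broken
toy_samples = [[0.1], [1.9]]
```

In this toy case both maps give the same QE, since they contain the same prototype vectors, but only the scrambled map has a non-zero TE, which shows why QE alone says nothing about topology.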

2.4.3 Clustering

Once the SOM had converged, the weight vectors of the chemical and meteorological variables were fed into a non-hierarchical K-means algorithm in order to find clusters among the neurons of the SOM. Partitioning by K-means requires the user to decide the final number K of groups (or clusters) into which the algorithm will converge. Different numbers of predefined clusters were tried, and finally, the seven-cluster pattern with the lowest Davies–Bouldin (DB) index (a function of the ratio of the within-cluster scatter to the between-cluster separation) (Davies and Bouldin 1979) was chosen. In order to assess the significance of the clustering pattern, non-parametric tests, namely the Kruskal–Wallis and Dunn's tests, were used; however, the Kruskal–Wallis test results are not discussed, since this test only confirms or denies the existence of a difference with respect to a given variable among more than two populations (in our case, clusters). When such confirmation was obtained, Dunn's test was subsequently performed. All calculations in this study were performed using Matlab 2007 (MathWorks, Inc.), Prism 5 Trial Version (GraphPad Software, Inc.), and Statistica 10.0 (Statsoft, Inc.) software running on a Windows Vista platform.
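
The selection of K by the Davies–Bouldin index can be sketched as follows. This is a simplified illustration, not the Matlab routine used in the study: the K-means seeding is a deterministic farthest-point rule chosen here only for repeatability, the scatter of a cluster is taken as the mean distance of its members to the centroid, and the input points are synthetic stand-ins for SOM weight vectors.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with farthest-point seeding (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = centroid(members)
    return labels, centroids


def davies_bouldin(points, labels, centroids):
    """Mean over clusters of the worst (scatter_i + scatter_j) / separation."""
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter.append(sum(dist(p, centroids[j]) for p in members)
                       / len(members) if members else 0.0)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                     for j in range(k) if j != i)
    return total / k


def db_by_k(points, k_values):
    scores = {}
    for k in k_values:
        labels, cents = kmeans(points, k)
        scores[k] = davies_bouldin(points, labels, cents)
    return scores


# Synthetic, purely illustrative "weight vectors": three tight groups.
demo = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10],
        [20, 0], [20, 1], [21, 0]]
scores = db_by_k(demo, range(2, 6))
best_k = min(scores, key=scores.get)
```

On these synthetic points the lowest DB value is obtained for three clusters, mirroring the way the seven-cluster solution was retained in the study.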

3 Results and Discussion

Monthly average concentrations of benzene, toluene, ethylbenzene, and xylene, as well as wind speed and temperature values, are summarized as basic statistics in Table 3. Besides the mean, minimum (Min), and maximum (Max) values, the standard deviation (S.D.), variability coefficient (VC), and skewness coefficient (SC) are also reported in Table 3. VC allows comparison of the standard deviations of variables that are expressed in different units of measure. This is useful when comparing measurements across multiple heterogeneous data sets or across multiple measurements taken on the same data set. It is calculated according to the formula \( \mathrm{VC}=\frac{\mathrm{S.D.}}{\overline{x}} \), while the skewness coefficient is a measure of the asymmetry of a data distribution with respect to the normal distribution: negative skewness indicates a left-skewed distribution, while positive skewness indicates a right-skewed one. The skewness coefficient values presented in this study were calculated according to the formula presented by Doane and Seward (2011): \( \mathrm{SC}=\frac{n}{(n-1)(n-2)}\sum\limits_{i=1}^{n}\left(\frac{x_i-\overline{x}}{\mathrm{S.D.}}\right)^3 \), where n is the sample size. Considering benzene and toluene mean concentrations, Garibaldi, Canova, Battisti, F. Severo, Corso Italia, and d’Alviano streets and Dalmazia square qualify as the most polluted sites, with means higher than 5 and 28 μg m−3 of benzene and toluene, respectively. All highly polluted sites are located in the city center or in the widespread downtown (Figs. 1 and 2). The SC values indicate strong (toluene and wind) or moderate (benzene, ethylbenzene, and xylene) data skewness at the majority of the locations, so the most common linear methods of statistical analysis might not be sufficient.
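
As an illustration of the two descriptive statistics defined above, the following sketch computes VC and SC for a list of values. It assumes that S.D. denotes the sample standard deviation (denominator n − 1); the function names and the example data are hypothetical.

```python
import math


def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator; an assumption)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def variability_coefficient(values):
    """VC = S.D. / mean (dimensionless)."""
    return sample_sd(values) / (sum(values) / len(values))


def skewness_coefficient(values):
    """SC = n / ((n-1)(n-2)) * sum(((x_i - mean) / S.D.)^3)."""
    n = len(values)
    mean = sum(values) / n
    sd = sample_sd(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in values)


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]   # synthetic, no skew expected
right_tailed = [1.0, 1.0, 1.0, 10.0]    # synthetic, one high outlier
```

A symmetric series gives SC equal to zero, while a series with a single high value, similar in shape to an occasional pollution peak, gives a positive SC.
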
In order to analyze the spatial and temporal variations of BTEX concentration in air in the city of Trieste, the data (consisting of 10,368 analytical and meteorological results) were arranged in a two-way array of six variables (benzene, toluene, ethylbenzene, and xylene (μg m−3), temperature (°C), and wind speed (m s−1)) by 1,728 air samples collected at 21 sampling sites. When the concentration of a given analyte was below the detection limit, the replacement code “NaN” (not a number) was entered in the data set in place of the missing value, because the self-organizing map algorithm implemented in the SOM Toolbox for Matlab is able to deal with missing data coded in this way. If an observation is missing a value for a specific variable, and is hence coded as NaN, that variable is omitted from the distance calculation for that observation (Sarkisian 2010). As a result, NaNs are neither treated as zeros nor cause the sample to be discarded. The replacement code NaN was entered in place of the missing value for benzene (0.06 % of total samples), toluene (0.29 %), ethylbenzene (1.1 %), and xylene (0.23 %). The total number of neurons of Kohonen’s map was estimated as \( n=5\cdot \sqrt{1{,}728}\approx 208 \). Such a number of map neurons can be obtained with more than one network dimension, which is why several dimensions were considered at this stage. In order to use the full space of the SOM efficiently, the considered dimensions were limited to between 15 and 29 for the vertical direction and between 7 and 14 for the horizontal direction. Since several combinations of dimensions give approximately the estimated number of neurons (i.e., 29 × 7 = 203, 26 × 8 = 208, 23 × 9 = 207, 21 × 10 = 210, 19 × 11 = 209, 17 × 12 = 204, 16 × 13 = 208, and 15 × 14 = 210), QE and TE were calculated in all cases.
Finally, the chosen dimensionality of 19 × 11 was characterized by the lowest values of quantization and topographic errors, 0.520 and 0.043, respectively. Moreover, a hexagonal lattice was preferred because it does not favor either horizontal or vertical direction (Kohonen 2001). Once the SOM’s grid had been optimized, the individual variable planes were visualized (Fig. 3).
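
Two steps of this procedure lend themselves to a short sketch: listing the candidate map dimensions and matching a sample with missing values to its best matching unit. The rule used below to pick candidates (for each admissible number of columns, the admissible number of rows whose product is closest to the target) is an assumption; the text does not state how the combinations were generated, although this rule happens to reproduce the listed ones. The NaN handling follows the description given above, and all function names are hypothetical.

```python
import math


def candidate_grids(n_samples, rows_range=(15, 29), cols_range=(7, 14)):
    """(rows, cols, rows * cols) closest to 5 * sqrt(n_samples) per column count."""
    target = 5 * math.sqrt(n_samples)
    grids = []
    for cols in range(cols_range[0], cols_range[1] + 1):
        rows = min(range(rows_range[0], rows_range[1] + 1),
                   key=lambda r: abs(r * cols - target))
        grids.append((rows, cols, rows * cols))
    return grids


def masked_distance(sample, prototype):
    """Euclidean distance over the variables that are not NaN in the sample."""
    return math.sqrt(sum((x - w) ** 2
                         for x, w in zip(sample, prototype)
                         if not math.isnan(x)))


def best_matching_unit(sample, prototypes):
    """Index of the prototype closest to the sample, ignoring NaN variables."""
    return min(range(len(prototypes)),
               key=lambda j: masked_distance(sample, prototypes[j]))


grids = candidate_grids(1728)

# Synthetic two-variable example: the first variable is missing (NaN),
# so only the second variable decides the match.
toy_prototypes = [[100.0, 5.0], [0.0, 1.2]]
toy_sample = [float("nan"), 1.0]
bmu = best_matching_unit(toy_sample, toy_prototypes)
```

Each candidate grid would then be trained and compared by QE and TE, as was done in the study to arrive at the 19 × 11 map.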

Fig. 3 SOM planes for meteorological variables as well as BTEXs determined in air samples

Each variable has its own component plane. This representation is usually a slice through the weight vectors, in which the highest (or most positive) weight is scaled to 1 and the lowest (or least positive) to 0 for each of the variables. In our case, component planes were scaled to represent the range of variability of a given parameter. Analyzing those planes, we concluded that high concentration values of the majority of BTEXs are located in the middle and left parts of the upper area, while high contents of toluene and low contents of benzene, ethylbenzene, and xylene are located in the right part of the upper area. At the same time, the relatively well-organized distribution of the temperature plane suggested seasonality. Since, according to SOM theory, each node of the map can be associated with one or more air samples, the differentiated structure of BTEX abundance (reflected in the different color scales of the planes) suggested the existence of several clusters of similar samples in the data set.

Consequently, the weight vectors of the converged map were clustered using the K-means algorithm. Different predefined numbers of clusters (K) were tried, and the sum of squares for each run was calculated. The best partition was obtained for a seven-cluster configuration, which had the lowest DB index value (Fig. 4). The numbers plotted within the map cells show how many times the weight vector of a given map neuron was the one most similar to an input vector. The node with the weight vector closest to the input vector is tagged as the best matching unit, and the number of such tags is counted for each node. Finally, the distribution of the sample vectors over the Kohonen map can be studied by decoding the best matching unit selection events. Describing the cluster composition in terms of the street codes and the months (Fig. 5) allowed an assessment of the spatial and temporal variations of BTEXs in the city of Trieste. The differences in BTEX concentration among the air samples clustered as CI–CVII, presented in Fig. 6, were examined by Dunn’s multiple comparison test (Table 4).

Fig. 4 Clustering pattern according to the Davies–Bouldin index minimum value

Fig. 5 Cluster composition according to the quantity of samples collected in a given location and in a given month

Fig. 6 Benzene, toluene, ethylbenzene, and xylene levels as well as temperature and wind speed values according to clustering pattern (central horizontal line: median; box: 25th–75th percentiles; whiskers: minimum–maximum)

Table 4 Statistical analysis (Dunn’s test) in the combination of clusters obtained using SOM approach

The results of Dunn’s test show that, in more than 70 % of the comparisons per variable, the difference in rank sum was statistically significant, mostly with a p value lower than 0.001. Most often, differences were found between CI, CIII, and CIV and the other clusters. Almost no differences between CII and the other clusters were found for benzene, ethylbenzene, and xylene. CV, CVI, and CVII did not differ from one another as far as BTEXs are concerned. It might be concluded that the clusters are not homogeneous. Among the seven clusters, cluster I includes 179 samples (10.36 %) with the highest benzene, ethylbenzene, and xylene concentrations. Most of the samples included in CI were collected at Garibaldi Sqr. (42), Battisti (32) and Canova (26) Streets, and Corso Italia (22), which form a compact city center. These data show that the downtown area of Trieste is the part of the city most polluted by BTEXs. There is no seasonal pattern in this case, since the contributions of samples collected in successive months are comparable (the median contribution to the cluster’s population is 13.5 samples per month). A temporal assessment over the whole span of years shows a progressive improvement of the air quality in the city center: 86 % of the samples of CI were uniformly collected in the period between January 2001 and December 2004, while only 14 % were collected in the period between January 2005 and December 2008. This finding, which is in agreement with Fig. 2, indicates that the strongest pollution events were common until 2004 and that the air quality in the city of Trieste has significantly improved. From the comparison of the variable planes (Fig. 3), a high abundance of BTEXs was found in the downtown area under meteorological conditions characterized by low wind speed (<0.93 m s−1), at both low and high temperatures.

Cluster II consists of only 13 (0.75 %) samples collected in December 2002 with a concentration of toluene exceeding 150 μg m−3. The small size of CII and the fact that all other pollutants were determined at relatively low levels indicate that the samples belonging to CII represent accidental pollution by toluene, mainly limited to the sampling sites belonging to the compact downtown area (Battisti Str. (BAT), Borsa square (BOR), Gallina Str. (GAL), Corso Italia (ITA), Canova Str. (CAN), and Garibaldi Str. (GAR)) or to the widespread downtown (F. Severo Str. (SEV), Dalmazia square (DAL), Mioni Ave. (MIO), Foraggi square (FOR), and dell’Istria Str. (IST)). The reason for this accidental increase in toluene concentration could be related to the painting of road markings.

Clusters III (375 samples, 21.70 %) and IV (271 samples, 15.68 %) consist of samples with moderate BTEX concentrations. Spatial analysis indicates that most of the samples in both cases were collected in the downtown (GAR (23), CAN (30), BAT (26), BOR (23), GAL (22), and ITA (24)) or in the widespread downtown (SEV (21), DAL (22), ALV (21), and ROS (21)) of the city, while temporal assessment shows the existence of a clear seasonal pattern. Samples clustered in CIII were mainly collected in summer, while those in CIV were mainly collected in winter. Samples from the stations located in the southern suburban areas (Monte S. Pantaleone (PAN), Carpineto Str. (CAR), and Pitacco Str. (PIT)) were not included in CIII and CIV or appeared there only occasionally. The only exception to this rule is VAL, which, even though it is located south of downtown, still appears in CIII and CIV. The reason for this might be the traffic caused by the proximity of the stadium. The higher median (Table 2) and mean (Fig. 6) concentrations of BTEXs determined in winter than in summer can be explained by the concurrence of several factors, such as the seasonality of emissions and of the atmospheric oxidants that lower BTEX levels, as well as the lower height of the mixed layer, which leads to higher pollutant concentrations during cold months. As for the multi-year assessment, 68 and 64 % of the samples in CIII and CIV, respectively, were collected in the period between January 2001 and December 2004, while the rest (32 and 36 %) were collected after January 2005. The relative prevalence of data collected before 2005 in the moderately polluted clusters CIII and CIV as well confirms the improvement of the air quality in the city. Average monthly concentrations of the different BTEXs in the whole city of Trieste over the 8 years are reported in Fig. 7 in order to support this observation.

Fig. 7 Long-term trend of BTEX concentration in air of the city of Trieste

The slope coefficient of the time series is negative for each pollutant, confirming the trend of air quality improvement. Similarly to the clusters described above, CV (425 samples, 24.60 %) and CVI (443 samples, 25.63 %) reflect a clear seasonal pattern, and in both cases, they consist of samples showing the lowest concentrations of BTEXs. In agreement with expectations, most of the samples were collected in the northern (COM, BEL, and LIB) and southern (PAN, CAR, and PIT) parts of Trieste. The relatively high contribution of samples collected after 2005 in CV and CVI (68 and 67 %, respectively), a significant part of which were also collected in downtown, again shows a continuous improvement of air quality in the city.
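
The sign of a trend of this kind can be checked with an ordinary least-squares slope of concentration against time. The sketch below assumes equally spaced observations and a simple linear fit; the text does not state which fitting method was used for Fig. 7, and the series shown is synthetic, not data from this study.

```python
def ols_slope(series):
    """Least-squares slope of a series against its index 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


# Synthetic yearly means (illustrative only): a steady decline.
synthetic_means = [10.0 - 0.5 * year for year in range(8)]
slope = ols_slope(synthetic_means)
```

A negative slope indicates decreasing concentrations over the period; with strongly seasonal monthly data, the slope describes only the underlying linear tendency.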

The last cluster, CVII, consists of 22 samples (1.27 %), and almost all of them (20) were collected in July 2007, when very strong wind occurred occasionally (17.1 m s−1). The isolation of CVII is mainly due to these extraordinary meteorological conditions.

4 Conclusions

A multi-year study on the spatial and temporal variation of BTEX concentrations in an urban environment is reported. More than 1,700 passive samples, each exposed for 3 weeks, were collected at 21 sites over 8 years in the city of Trieste (Italy) and analyzed for BTEX concentrations; together with the meteorological data, they provide a relevant base of 10,368 data points useful for the assessment of urban VOC evolution. The method used in this research allowed, through the application of a single chemometric technique, a logical spatiotemporal grouping of the air quality samples collected in the city of Trieste. The SOM modeling approach proved adequate for managing such an extended data collection, characterized by a relatively large size and some missing data. An interpretable clustering of the data and a meaningful data representation are provided.

A partition of urban areas (zonation) based on air quality parameters is provided, and a change over time is shown, with the most polluted clusters/situations disappearing in recent years. This evidence probably has several co-determining factors (Nuvolone et al. 2009); the changes in private and public vehicular fleets and urban management measures, such as the removal of most fueling stations from the city center and the limitation of access to the city center imposed on vehicles with the highest emissions, have plausibly played a role. The zonation based on the similarities revealed by passive samplers positioned at different locations can provide hints for the optimization of the air quality monitoring network equipped with more expensive wired automatic instruments.

Results indicate that, during the considered timeframe and according to the sampling scheme, downtown Trieste is the most polluted area among those considered. The level of pollution, and hence the hazard for humans, decreases as the distance between a particular location and the city center increases. Clustering based on self-organizing maps made it possible to detect both long- and short-term (seasonal) changes in the air quality of the city of Trieste. The existence of seasonality in BTEX atmospheric concentrations was confirmed by the higher abundance of benzene, toluene, ethylbenzene, and xylene in winter than in summer, pointing to the importance of the dispersion associated with the expansion of the mixing layer during the warm season.

The monitoring and data analysis scheme considered here allows an effective assessment of the evolution of pollution mitigation associated with traffic impacts. The negative slope coefficient of the BTEX concentration time series confirmed the tendency towards air quality improvement in the area of interest, providing experimental evidence of the effects of the EU Directives aimed at the reduction of BTEX concentrations in air.