Abstract
In this paper, we investigate spatial and temporal variations in the composition of wastewater near Croatian highways in three climatic regions (continental, Mediterranean and highland) during three seasons (autumn, winter and spring). The division of the investigated area into these three climatic regions was obtained by hierarchical clustering of the monitored locations. A total of 1,533 samples from 14 locations along Croatian highways were collected and analysed by methods of multivariate exploratory analysis. Using principal component analysis, factor analysis and hierarchical clustering of variables, we grouped the variables into factors. Three principal components explained 60 % of the variation in the data, and six principal components accounted for 88 %. The key part of our research was conducted using decision trees. We classified the 1,533 samples into three classes representing the climatic regions, separately for each season, and obtained an accuracy of 76–90 % on test samples. Finally, using the decision trees, we identified the most important variables that differentiate the climatic regions by the level of water contamination along highways in different seasons.
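As an illustration of how a figure such as "three principal components explain 60 % of the variation" is typically derived, the following minimal sketch computes cumulative explained variance from the eigenvalues of a correlation matrix. The function names and the eigenvalues are hypothetical and chosen only for illustration; they are not the values or the code used in the study.

```python
from itertools import accumulate


def cumulative_explained_variance(eigenvalues):
    """Cumulative share of total variance, components sorted by decreasing eigenvalue."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [running / total for running in accumulate(ordered)]


def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative share reaches the threshold."""
    shares = cumulative_explained_variance(eigenvalues)
    for k, share in enumerate(shares, start=1):
        if share >= threshold:
            return k
    return len(shares)


# Hypothetical eigenvalues of a 10-variable correlation matrix (they sum to 10).
# Illustrative only: these are NOT the eigenvalues reported in the paper.
example_eigenvalues = [3.0, 2.0, 1.5, 1.0, 0.9, 0.6, 0.4, 0.3, 0.2, 0.1]

shares = cumulative_explained_variance(example_eigenvalues)
k60 = components_needed(example_eigenvalues, 0.60)
k88 = components_needed(example_eigenvalues, 0.88)
```

Each eigenvalue of the correlation matrix equals the variance carried by one principal component, so dividing the running sum by the total gives the proportion of variation retained when only the leading components are kept.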

Acknowledgments
We would like to thank Hrvatske Autoceste (Croatian Highways) for permission to use their data for scientific research.
Cite this article
Dobsa, J., Meznaric, V., Tompic, T. et al. Evaluation of Spatial and Temporal Variation in Water Contamination Along Croatian Highways by Multivariate Exploratory Analysis. Water Air Soil Pollut 225, 2083 (2014). https://doi.org/10.1007/s11270-014-2083-x
DOI: https://doi.org/10.1007/s11270-014-2083-x
Keywords
- Highway
- Water contamination
- Hierarchical clustering
- Principal components
- Factor analysis
- Decision tree