Abstract
At contaminated sites, hot spots usually skew the data distribution severely, which makes accurate geostatistical data transformation difficult. Three common normal-distribution transformations (normal score, Johnson, and Box–Cox) were applied to benzo(b)fluoranthene data from a large-scale coking plant-contaminated site in north China, and their effects on spatial interpolation were compared. All three transformations decreased the skewness and kurtosis of the benzo(b)fluoranthene data, and all the transformed data passed the Kolmogorov–Smirnov test threshold. Cross validation showed that Johnson ordinary kriging had the lowest root-mean-square error (1.17), with a mean error of 0.19, making it more accurate than the other two models. On the Johnson ordinary kriging prediction map, the largest prediction standard errors occurred in areas with few sampling points and in areas with high levels of contamination. We present a well-suited normal transformation method to apply before geostatistical estimation of severely skewed data, which enhances the reliability of risk estimation and improves the accuracy of remediation-boundary determination.
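To make the workflow in the abstract more concrete, the following is a minimal Python sketch of one of the three transformations (the rank-based normal score transform), together with a skewness check and the two cross-validation statistics reported above (mean error and root-mean-square error). It is not the authors' implementation: the function names, the plotting position `(rank - 0.5) / n`, and the concentration values are all assumptions chosen for illustration, and the Johnson and Box–Cox transformations and the kriging step itself are omitted.

```python
import math
from statistics import NormalDist, mean


def skewness(values):
    """Population skewness (third standardized moment)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def normal_score(values):
    """Rank-based normal score transform; tied values share an average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Assumed plotting position: keeps probabilities strictly inside (0, 1).
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


def cross_validation_errors(observed, predicted):
    """Return (mean error, root-mean-square error) of predicted minus observed."""
    residuals = [p - o for o, p in zip(observed, predicted)]
    me = mean(residuals)
    rmse = math.sqrt(mean(r * r for r in residuals))
    return me, rmse


# Hypothetical, strongly right-skewed concentrations (not data from the study).
concentrations = [0.2, 0.3, 0.4, 0.5, 0.8, 1.1, 1.9, 4.2, 15.0, 62.0]
scores = normal_score(concentrations)

print("skewness before:", round(skewness(concentrations), 2))
print("skewness after: ", round(skewness(scores), 2))
```

In a full workflow, kriging would be run on the transformed values and the predictions back-transformed before computing the cross-validation errors; the back-transformation is left out of this sketch.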
Acknowledgments
This research was supported by the Young Scientists Fund of the National Natural Science Foundation of China (Grant Nos. 41401236 and 40901249), the Youth Science and Technology Research Foundation of Shanxi, China (Grant No. 2015021166), and the National High Technology Research and Development Program of China (863 Program: 2013AA06A206).
Additional information
Responsible editor: Marcus Schulz
Cite this article
Liu, G., Niu, J., Zhang, C. et al. Accuracy and uncertainty analysis of soil Bbf spatial distribution estimation at a coking plant-contaminated site based on normalization geostatistical technologies. Environ Sci Pollut Res 22, 20121–20130 (2015). https://doi.org/10.1007/s11356-015-5122-2
DOI: https://doi.org/10.1007/s11356-015-5122-2