Predicting dissolved oxygen concentration using kernel regression modeling approaches with nonlinear hydro-chemical data

Environmental Monitoring and Assessment

Abstract

Kernel function-based regression models were constructed and applied to a nonlinear hydro-chemical dataset for surface water to predict dissolved oxygen levels. Initial features were selected using a nonlinear approach. Nonlinearity in the data was tested using the BDS statistic, which revealed a nonlinear structure in the data. Kernel ridge regression, kernel principal component regression, kernel partial least squares regression, and support vector regression models were developed using the Gaussian kernel function, and their generalization and predictive abilities were compared in terms of several statistical parameters. Model parameters were optimized using a cross-validation procedure. The kernel regression methods successfully captured the nonlinear features of the original data by transforming it to a high-dimensional feature space using the kernel function. The performance of all the kernel-based modeling methods used here was comparable in terms of both predictive and generalization abilities. The values of the performance criteria indicated that the constructed models fit the nonlinear data adequately and have good predictive capabilities.
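As an illustration only, one of the four approaches named above, kernel ridge regression with a Gaussian kernel and cross-validated parameter selection, can be sketched as follows. This is not the authors' implementation: the synthetic data, the parameter grids, the five-fold split, and all function names (`gaussian_kernel`, `krr_fit`, `krr_predict`, `cv_rmse`, `select_parameters`) are assumptions made for the example.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def krr_fit(X, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_new, sigma):
    """Predict as a kernel-weighted sum over the training samples."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha


def cv_rmse(X, y, sigma, lam, n_folds=5, seed=0):
    """Root mean squared error pooled over k-fold held-out predictions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    sq_err = 0.0
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        alpha = krr_fit(X[train], y[train], sigma, lam)
        pred = krr_predict(X[train], alpha, X[test], sigma)
        sq_err += np.sum((pred - y[test]) ** 2)
    return float(np.sqrt(sq_err / len(X)))


def select_parameters(X, y, sigmas, lams):
    """Grid search: return the (sigma, lam) pair with the lowest CV error."""
    scores = {(s, l): cv_rmse(X, y, s, l) for s in sigmas for l in lams}
    return min(scores, key=scores.get), scores


# Synthetic nonlinear data standing in for the hydro-chemical variables
# (assumption: two predictors, one response; not the dataset of the paper).
rng = np.random.default_rng(42)
X = rng.uniform(-3.0, 3.0, size=(120, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0.0, 0.1, size=120)

(best_sigma, best_lam), scores = select_parameters(
    X, y, sigmas=[0.3, 1.0, 3.0], lams=[1e-3, 1e-1, 10.0]
)
print("selected sigma:", best_sigma, "selected lambda:", best_lam)
print("cross-validated RMSE:", scores[(best_sigma, best_lam)])
```

The same cross-validation loop could in principle be reused for the other kernel models by swapping the fit and predict functions; the kernel width and the regularization strength are tuned jointly here because they interact.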



Acknowledgements

The authors thank the Director, CSIR-Indian Institute of Toxicology Research, Lucknow, for his keen interest in this work.

Author information

Corresponding author

Correspondence to Kunwar P. Singh.

About this article

Cite this article

Singh, K.P., Gupta, S. & Rai, P. Predicting dissolved oxygen concentration using kernel regression modeling approaches with nonlinear hydro-chemical data. Environ Monit Assess 186, 2749–2765 (2014). https://doi.org/10.1007/s10661-013-3576-6

  • DOI: https://doi.org/10.1007/s10661-013-3576-6
