Environmental Modeling & Assessment, Volume 23, Issue 6, pp 779–786

Robust Regression with Data-Dependent Regularization Parameters and Autoregressive Temporal Correlations

  • Na Wang
  • You-Gan Wang (corresponding author)
  • Shuwen Hu
  • Zhi-Hua Hu
  • Jing Xu
  • Hongwu Tang
  • Guangqiu Jin


We introduce robust procedures for analyzing water quality data collected over time. One challenging task in analyzing such data is achieving robustness in the presence of outliers while maintaining high estimation efficiency, so that we can draw valid conclusions and provide useful advice for water management. The robust approach requires specifying a loss function, such as the Huber, Tukey’s bisquare, or exponential loss function, and an associated tuning parameter that determines the extent of robustness needed. High robustness comes at the cost of efficiency loss in parameter estimation. To address this trade-off, we propose a data-driven method that leads to more efficient parameter estimation. This data-dependent approach allows us to choose a regularization (tuning) parameter that depends on the proportion of “outliers” in the data, so that estimation efficiency is maximized. We illustrate the proposed methods using a study of ammonium nitrogen concentrations from two sites on the Huaihe River in China, where the interest is in quantifying the trend in the most recent years while accounting for possible temporal correlations and “irregular” observations in earlier years.
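
The three loss functions named above, and the idea of picking a tuning parameter from the data, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of the paper: the function names are hypothetical, the residuals are assumed to be standardized by a robust scale estimate (the MAD is used here), the exponential loss is taken in its usual form 1 - exp(-r^2/gamma), and the selection criterion is the standard plug-in estimate of M-estimator efficiency, (mean of psi')^2 / mean of psi^2, evaluated for the Huber function over a grid of candidate constants.

```python
import numpy as np

def huber_loss(r, c):
    """Huber loss: quadratic for |r| <= c, linear beyond c."""
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r**2, c * a - 0.5 * c**2)

def bisquare_loss(r, c):
    """Tukey's bisquare loss: bounded, equal to c^2 / 6 once |r| >= c."""
    u = np.clip(np.abs(r) / c, 0.0, 1.0)
    return (c**2 / 6.0) * (1.0 - (1.0 - u**2) ** 3)

def exponential_loss(r, gamma):
    """Exponential squared loss: 1 - exp(-r^2 / gamma), bounded by 1."""
    return 1.0 - np.exp(-(r**2) / gamma)

def mad_scale(r):
    """Robust scale estimate (MAD, scaled to match the normal standard deviation)."""
    return 1.4826 * np.median(np.abs(r - np.median(r)))

def huber_efficiency(r, c):
    """Plug-in estimate of (E psi')^2 / E psi^2 for the Huber psi function."""
    psi = np.clip(r, -c, c)                 # Huber psi: residual clipped to [-c, c]
    dpsi = (np.abs(r) <= c).astype(float)   # its derivative: 1 inside, 0 outside
    return dpsi.mean() ** 2 / np.mean(psi**2)

def choose_tuning_constant(residuals, grid=None):
    """Return the grid value of c with the largest estimated efficiency."""
    if grid is None:
        grid = np.linspace(0.5, 3.0, 26)    # assumed search range for c
    r = np.asarray(residuals, dtype=float)
    r = r / mad_scale(r)                    # standardize by a robust scale
    eff = np.array([huber_efficiency(r, c) for c in grid])
    return float(grid[np.argmax(eff)])
```

Under this sketch, residuals with a larger share of outliers tend to select a smaller constant (more robustness), while clean, roughly normal residuals tend to select a larger one (closer to least squares). Temporal correlation is not handled here.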


Keywords: Ammonia nitrogen; Regularization; Log-linear model; Model selection; Robust estimation; Temporal correlations


Funding Information

This research was funded by Australian Research Council projects DP130100766 and DP160104292.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  2. Logistics Research Center, Shanghai Maritime University, Shanghai, China
  3. State Key Laboratory of Hydrology-Water Resource and Hydraulic Engineering, Hohai University, Nanjing, China
