
Estimating Percentiles of Bacteriological Counts of Recreational Water Quality Using Tweedie Models

  • Original Paper
  • Water Quality, Exposure and Health

Abstract

General guidelines and standards exist for measuring the microbial quality of water in order to prevent disease outbreaks. Many agencies have chosen the 95th percentile as the criterion: recreational water quality is assessed according to whether the percentile value exceeds the guideline value. It is well known that data of this kind are not normally distributed, and several alternatives have been proposed and are in use for estimating the percentile. We review existing methods, including nonparametric estimators such as Hazen, Blom, Tukey and Weibull. We also describe transformations, such as the logarithmic and Box–Cox transformations, that generate near-normal data; after the normal percentile is obtained, the inverse transformation is applied to obtain estimators in the original scale. We propose a new methodology that consists of finding the Tweedie distribution that best fits the observed data. This family has nonnegative support and can have a discrete mass at zero, which makes it useful for modeling skewed data that are a mixture of zeros and positive values, and it allows working with parametric models in the original scale. We performed a Monte Carlo simulation to compare the performance of all the percentile estimators described above. The percentile calculated from the Tweedie distribution had a lower mean square error than the others, which makes it the most precise estimator. All these techniques were applied to four data sets and, in all cases, the Tweedie estimator was closer to the observed values than the nonparametric and back-transformed estimators.
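The reviewed estimators can be illustrated with a minimal Python sketch. This is not the authors' code, and the abstract does not give their exact definitions. The sketch assumes the common plotting-position form p_i = (i - a)/(n + 1 - 2a), with a = 0 (Weibull), 1/3 (Tukey), 3/8 (Blom) and 1/2 (Hazen), linear interpolation between order statistics, and a base-10 logarithmic transformation. The function names are hypothetical, and the log-based estimator as written requires strictly positive counts.

```python
import math
from statistics import NormalDist, mean, stdev

# Constant a in the plotting position p_i = (i - a) / (n + 1 - 2a).
# These values are the usual textbook choices (an assumption here).
PLOTTING_A = {"weibull": 0.0, "tukey": 1.0 / 3.0, "blom": 3.0 / 8.0, "hazen": 0.5}


def plotting_position_percentile(values, p=0.95, method="hazen"):
    """Nonparametric percentile by interpolating between order statistics."""
    a = PLOTTING_A[method]
    xs = sorted(values)
    n = len(xs)
    # Fractional rank r that solves (r - a) / (n + 1 - 2a) = p.
    r = a + p * (n + 1 - 2 * a)
    # Clamp to the observed range: no extrapolation beyond min or max.
    r = min(max(r, 1.0), float(n))
    lower = int(math.floor(r))
    if lower >= n:
        return xs[-1]
    frac = r - lower
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


def lognormal_percentile(values, p=0.95):
    """Normal percentile on the log10 scale, back-transformed to counts."""
    logs = [math.log10(v) for v in values]  # fails on zeros by design
    z = NormalDist().inv_cdf(p)
    return 10 ** (mean(logs) + z * stdev(logs))


if __name__ == "__main__":
    counts = [12, 35, 8, 150, 40, 22, 310, 5, 60, 18]  # made-up illustrative data
    for name in PLOTTING_A:
        print(name, plotting_position_percentile(counts, 0.95, name))
    print("log-normal", lognormal_percentile(counts, 0.95))
```

With small samples the fractional rank for the 95th percentile can exceed n, in which case this sketch returns the sample maximum. The failure of the log-based estimator on zero counts shows why a model with a point mass at zero is attractive for this kind of data.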


References

  • Bartram J, Rees G (2000) Monitoring bathing waters. E and FN Spon, London

  • Beamonte E, Bermúdez JD, Casino A, Veres E (2007) A statistical study of the quality of surface water intended for human consumption near Valencia (Spain). J Environ Manage 83(3):307–314 (ISSN 0301–4797). http://dx.doi.org/10.1016/j.jenvman.2006.03.010

  • Box GE, Cox DR (1964) An analysis of transformations. J R Stat Soc B 26:211–252

  • Chawla R, Hunter PR (2005) Classification of bathing water quality based on the parametric calculation of percentiles is unsound. Water Res 39(18):4552–4558 (ISSN 0043–1354). http://dx.doi.org/10.1016/j.watres.2005.08.022

  • Crabtree RW, Cluckie ID, Forster CF (1987) Percentile estimation for water quality data. Water Res 21(5):583–590 (ISSN 0043–1354). http://dx.doi.org/10.1016/0043-1354(87)90067-4

  • Dunn PK, Smyth GK (2005) Series evaluation of Tweedie exponential dispersion model densities. Stat Comput 15(4):267–280


  • Dunn P (2004) Tweedie exponential family models. R package version 1.02. http://www.r-project.org/

  • Ellis JC (1989) Handbook on the design and interpretation of monitoring programmes. Report NS 29. WRc Environment, Water Research Centre, Medmenham, England

  • Feller W (1978) Introducción a la Teoría de Probabilidades y sus Aplicaciones, vol II. Limusa

  • Freeman J, Modarres R (2006) Inverse Box–Cox: the power-normal distribution. Stat Probab Lett 76:764–772


  • Hunter PR (2002) Does calculation of the 95th percentile of microbiological results offer any advantage over percentage exceedence in determining compliance with bathing water quality standards? Lett Appl Microbiol 34(4):283–286


  • Jørgensen B (1992) The theory of exponential dispersion models and analysis of deviance. Mathematical Monographs no 51. IMPA, Rio de Janeiro, Brasil

  • Jørgensen B (1997) The theory of dispersion models. Chapman and Hall, Boca Raton


  • Modarres R, Nayak TK, Gastwirth JL (2002) Estimation of upper quantiles under model and parameter uncertainty. Comput Stat Data Anal 39:529–554


  • R Development Core Team (2006) R: a language and environment for statistical computing. R foundation for statistical computing, Vienna, Austria. http://www.r-project.org/

  • Taylor JM (1985) Measures of location of skew distributions obtained through Box–Cox transformations. J Am Stat Assoc 80(390):427–432

  • Tweedie MCK (1984) An index which distinguishes between some important exponential families. In Ghosh JK, Roy J (eds) Statistics: applications and new directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference, pp 579–604. Indian Statistical Institute, Calcutta


Author information


Corresponding author

Correspondence to Maria Laura Patat.


About this article


Cite this article

Patat, M.L., Ricci, L., Comino, A.P. et al. Estimating Percentiles of Bacteriological Counts of Recreational Water Quality Using Tweedie Models. Water Qual Expo Health 7, 227–231 (2015). https://doi.org/10.1007/s12403-014-0143-5


  • DOI: https://doi.org/10.1007/s12403-014-0143-5
