GTest: a software tool for graphical assessment of empirical distributions’ Gaussianity

Barca, E.; Bruno, E.; Bruno, D. E.; Passarella, G.

doi:10.1007/s10661-016-5138-1

Published: 03 February 2016

Volume 188, article number 138 (2016)

Environmental Monitoring and Assessment

Abstract

This paper introduces GTest, a novel software tool for testing the normality of a user-specified empirical distribution. It has two unusual characteristics. First, the user can select among four different versions of the normality test, each suited to a specific dataset or goal. Second, the inferential paradigm behind the output of these tests is essentially graphical and intrinsically self-explanatory. Inference-by-eye is an emerging inferential approach that is likely to find successful application in the near future, given the growing need to extend statistical methods to users with informal statistical skills. For instance, the latest European regulation concerning environmental issues introduced strict protocols for data handling (data quality assurance, outlier detection, etc.) and information exchange (areal statistics, trend detection, etc.) between regional and central environmental agencies. As a result, laboratory and field technicians will increasingly be asked to use complex software applications to subject data from monitoring, surveying or laboratory activities to specific statistical analyses. Unfortunately, inferential statistics, which influence the decision processes for the correct management of environmental resources, are often implemented so that their outcomes are expressed in numerical form with brief comments in strict statistical jargon (degrees of freedom, level of significance, accepted/rejected H₀, etc.). Interpreting such outcomes is therefore often difficult for people with limited statistical knowledge. In this framework, the visual inference paradigm can help fill the gap by providing outcomes in self-explanatory graphical form with a brief comment in common language.
The difficulties experienced by colleagues, and their request for an effective tool to address them, motivated us to adopt the inference-by-eye paradigm and to implement an easy-to-use, quick and reliable statistical tool. GTest visualizes its outcomes as a modified version of the Q-Q plot. The application was developed in Visual Basic for Applications (VBA) within MS Excel 2010, which proved to have the robustness and reliability needed. GTest provides true graphical normality tests that are as reliable as any quantitative statistical approach but much easier to understand. The Q-Q plots are augmented with an acceptance region outlined around the representation of the theoretical distribution and defined according to the significance level alpha and the sample size. The decision rule is as follows: if the empirical scatterplot falls completely within the acceptance region, it can be concluded that the empirical distribution fits the theoretical one at the given alpha level. A comprehensive case study was carried out with simulated and real-world data to check the robustness and reliability of the software.
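
The decision rule stated in the abstract can be illustrated with a short Python sketch. The abstract does not say how GTest constructs its acceptance region, so the construction here is an assumption made only for illustration: a Monte Carlo simultaneous envelope around the standardized order statistics of a normal sample, calibrated so that a truly normal sample of size n leaves the band with probability of roughly alpha. The function names (`standardized_order_stats`, `acceptance_region`, `falls_within_region`) and the parameters `n_sim` and `seed` are hypothetical and are not part of GTest.

```python
from statistics import NormalDist

import numpy as np


def standardized_order_stats(x):
    """Sort along the last axis and standardize by sample mean and sd."""
    x = np.sort(np.asarray(x, dtype=float), axis=-1)
    mean = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, ddof=1, keepdims=True)
    return (x - mean) / sd


def acceptance_region(n, alpha=0.05, n_sim=5000, seed=0):
    """Assumed construction (not GTest's): simulated simultaneous band.

    Returns lower and upper limits for each of the n standardized
    order statistics of a normal sample.
    """
    rng = np.random.default_rng(seed)
    sims = standardized_order_stats(rng.standard_normal((n_sim, n)))
    center = sims.mean(axis=0)          # Q-Q reference line
    spread = sims.std(axis=0, ddof=1)   # variability of each order statistic
    # Largest scaled deviation per simulated sample; its (1 - alpha)
    # quantile widens the band until about (1 - alpha) of the normal
    # samples stay inside it at every point.
    max_dev = np.max(np.abs(sims - center) / spread, axis=1)
    c = np.quantile(max_dev, 1 - alpha)
    return center - c * spread, center + c * spread


def falls_within_region(sample, alpha=0.05, n_sim=5000, seed=0):
    """Decision rule: True only if every Q-Q point lies inside the band."""
    z = standardized_order_stats(sample)
    lower, upper = acceptance_region(len(z), alpha, n_sim, seed)
    return bool(np.all((z >= lower) & (z <= upper)))


# Usage: an idealized normal sample and a sample with one gross outlier.
normal_like = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
outlier_sample = np.append(np.arange(49.0), 1000.0)
print(falls_within_region(normal_like))
print(falls_within_region(outlier_sample))
```

Plotting `lower` and `upper` against the theoretical quantiles together with the standardized sample would give a picture in the spirit of the modified Q-Q plot described above. The band widens with smaller alpha and changes shape with the sample size, which matches the dependence on alpha and n mentioned in the abstract.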

Author information

Authors and Affiliations

Water Research Institute, National Research Council, Viale De Blasio, 5-70125, Bari, Italy
E. Barca, E. Bruno, D. E. Bruno & G. Passarella

Corresponding author

Correspondence to E. Barca.

About this article

Cite this article

Barca, E., Bruno, E., Bruno, D.E. et al. GTest: a software tool for graphical assessment of empirical distributions’ Gaussianity. Environ Monit Assess 188, 138 (2016). https://doi.org/10.1007/s10661-016-5138-1

Received: 24 June 2015
Accepted: 26 January 2016
Published: 03 February 2016
DOI: https://doi.org/10.1007/s10661-016-5138-1
