GTest: a software tool for graphical assessment of empirical distributions’ Gaussianity

Environmental Monitoring and Assessment

Abstract

This paper introduces GTest, a new software tool for testing the normality of a user-specified empirical distribution. It has two unusual characteristics. First, the user can choose among four versions of the normality test, each suited to a specific kind of dataset or goal. Second, the inferential paradigm behind the output is essentially graphical and intrinsically self-explanatory. Inference-by-eye is an emerging inferential approach that is likely to find successful application in the near future, because of the growing need to extend statistical methods to users with only informal statistical skills. For instance, the latest European regulation on environmental issues introduced strict protocols for data handling (data quality assurance, outlier detection, etc.) and information exchange (areal statistics, trend detection, etc.) between regional and central environmental agencies. As a result, laboratory and field technicians will increasingly be asked to use complex software applications to subject data from monitoring, surveying or laboratory activities to specific statistical analyses. Unfortunately, inferential statistics, which directly influence decision-making in the management of environmental resources, are often implemented so that their outcomes appear in numerical form, accompanied by brief comments in strict statistical jargon (degrees of freedom, level of significance, accepted/rejected H0, etc.). Such outcomes are often very difficult to interpret for people with limited statistical knowledge. In this framework, the visual inference paradigm can help fill the gap by providing outcomes in self-explanatory graphical form with a brief comment in plain language.
In fact, the difficulties experienced by colleagues, and their request for an effective tool to address them, motivated us to adopt the inference-by-eye paradigm and to implement an easy-to-use, quick and reliable statistical tool. GTest visualizes its outcomes as a modified version of the Q-Q plot. The application was developed in Visual Basic for Applications (VBA) within MS Excel 2010, which proved to have the robustness and reliability needed. GTest provides true graphical normality tests that are as reliable as any quantitative statistical approach but much easier to understand. The Q-Q plots are augmented with an acceptance region drawn around the representation of the theoretical distribution and defined according to the significance level alpha and the sample size. The decision rule is as follows: if the empirical scatterplot falls completely within the acceptance region, it can be concluded that the empirical distribution fits the theoretical one at the given alpha level. A comprehensive case study was carried out with simulated and real-world data to check the robustness and reliability of the software.
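
The decision rule described above can be illustrated with a minimal sketch. The abstract does not specify how GTest builds its acceptance region or what its four test versions are, so the code below substitutes a generic Monte Carlo simultaneous envelope as a stand-in. It is written in Python rather than VBA, and the function names (standardize_sorted, acceptance_band, fits_normal) and the parameters n_sim and seed are hypothetical. The only features it shares with the description are that the band depends on alpha and on the sample size, and that normality is accepted only when every Q-Q point lies inside the band.

```python
import random
import statistics


def standardize_sorted(sample):
    """Sort a sample after centring and scaling it by its own mean and sd."""
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    return sorted((x - mean) / sd for x in sample)


def acceptance_band(n, alpha=0.05, n_sim=2000, seed=0):
    """Monte Carlo simultaneous band for the n standardized order statistics.

    Assumption: this is a generic envelope, not the region used by GTest.
    Returns (centre, lower, upper), three lists of length n.
    """
    rng = random.Random(seed)
    sims = [
        standardize_sorted([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(n_sim)
    ]
    # Pointwise mean and sd of each order statistic under normality.
    columns = [[s[i] for s in sims] for i in range(n)]
    centre = [statistics.fmean(col) for col in columns]
    spread = [statistics.stdev(col) for col in columns]
    # For every simulated sample, its largest scaled deviation from the centre.
    worst = sorted(
        max(abs(s[i] - centre[i]) / spread[i] for i in range(n)) for s in sims
    )
    # Critical value: roughly (1 - alpha) of normal samples stay within it.
    c = worst[min(n_sim - 1, int((1 - alpha) * n_sim))]
    lower = [m - c * sd for m, sd in zip(centre, spread)]
    upper = [m + c * sd for m, sd in zip(centre, spread)]
    return centre, lower, upper


def fits_normal(sample, alpha=0.05, n_sim=2000, seed=0):
    """Decision rule: True only if every Q-Q point lies inside the band."""
    z = standardize_sorted(sample)
    _, lower, upper = acceptance_band(len(sample), alpha, n_sim, seed)
    return all(lo <= zi <= hi for zi, lo, hi in zip(z, lower, upper))
```

Calling fits_normal(data, alpha=0.05) returns a single yes/no answer. Plotting centre, lower and upper against the sorted standardized data would give the graphical form of the same decision. Scaling each deviation by its pointwise spread makes the band wider at the tails, where order statistics vary most, which is why a single critical value can serve all points at once.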

Corresponding author

Correspondence to E. Barca.

Cite this article

Barca, E., Bruno, E., Bruno, D.E. et al. GTest: a software tool for graphical assessment of empirical distributions’ Gaussianity. Environ Monit Assess 188, 138 (2016). https://doi.org/10.1007/s10661-016-5138-1

  • DOI: https://doi.org/10.1007/s10661-016-5138-1
