Abstract—
Statistical analysis is an integral part of every experiment, since it helps researchers draw conclusions from their work. Textbooks describing the algorithms of statistical analysis in detail are available, as are diverse software packages. Statistical methods are, however, often applied incorrectly, which leads to erroneous or unsupported conclusions. The present article attempts to identify the main problems that arise in the statistical analysis of experimental results in environmental microbiology. For instance, the classical parametric tests most often used in experimental articles (t-test, analysis of variance, Pearson correlation coefficient, etc.) are applicable only when the random variable underlying the n independent observations is normally distributed. Importantly, the normality of the distribution must be verified, which is feasible only with a sample size of at least 20‒30 independent observations per group. This requirement is crucial for incubation, isotope, and molecular biological experiments. The family of normal distributions is not the only family of parameter-dependent (parametric) distributions of random variables. Moreover, real-life distributions usually differ from the normal one, which can be regarded only as an approximation. The high diversity of parametric families complicates the choice of criteria and statistical tests for analyzing a data set of n independent observations. Nonparametric statistics may help, since it is free of sample size and distribution requirements, although, like any method of analysis, it has its own limitations. A growing body of experimental data, including data in environmental microbiology, is now analyzed with nonparametric statistics, which indicates a trend toward replacing parametric methods with nonparametric ones.
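As an illustration of the nonparametric route for small groups, the following is a minimal sketch of a two-sample rank comparison (the Mann–Whitney U statistic with an exact permutation p-value), written with the Python standard library only. The data values, group names, and function names are hypothetical and are not taken from the article; the sketch assumes small samples, since it enumerates every possible relabeling of the pooled observations.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: count of pairs (xi, yj) with xi > yj; ties count 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


def exact_two_sided_p(x, y):
    """Exact two-sided permutation p-value for U.

    Under the null hypothesis the group labels are exchangeable, so every
    way of choosing len(x) observations from the pooled sample is equally
    likely. Feasible only for small samples (the loop is combinatorial).
    """
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2.0  # expected value of U under the null
    observed = abs(mann_whitney_u(x, y) - mean_u)
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n1):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mann_whitney_u(gx, gy) - mean_u) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical measurements from two groups of five independent replicates.
control = [1.2, 0.9, 1.5, 1.1, 1.3]
treated = [2.1, 1.8, 2.6, 1.9, 2.4]

u = mann_whitney_u(control, treated)
p = exact_two_sided_p(control, treated)
print(f"U = {u}, exact two-sided p = {p:.4f}")
```

With five observations per group, normality cannot be checked reliably, whereas the rank-based comparison above uses only the ordering of the values. Its limitation is also visible: with so few observations, only a complete separation of the two groups can yield a small p-value.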
Notes
Parametric statistics deals with sample data that follow a probability distribution described by an analytical formula with a small number of parameters (such distributions are termed parametric). A (probability) distribution is a function that determines the probability that a random variable takes a given value or falls within a given interval. In parametric estimation problems, a probabilistic model is adopted in which the observations x1, x2, …, xn are interpreted as realizations of n independent random variables with the distribution function F(x; θ). Here θ is an unknown vector parameter (of fixed finite dimension, including dimension 1, when θ is a single number) belonging to the parameter space Θ, which is specified by the probabilistic model used. The goal of estimation is to determine point estimates and confidence intervals (or a confidence region) for the components of the vector parameter θ. Nonparametric statistical methods do not rely on the assumption that the distribution function belongs to a given parametric family, i.e., they do not require the distribution function to be given by a specific analytical formula (Orlov, 2004).
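The distinction can be made concrete with a minimal sketch, assuming a normal model F(x; θ) with θ = (μ, σ) for the parametric case and the empirical distribution function for the nonparametric case. The observations and the helper name `ecdf` are hypothetical; `NormalDist` comes from the Python standard library, and only point estimates are shown (confidence intervals are omitted).

```python
from statistics import NormalDist


def ecdf(sample, x):
    """Empirical distribution function: share of observations <= x.

    No parametric family is assumed; the estimate uses the data alone.
    """
    return sum(1 for v in sample if v <= x) / len(sample)


# Hypothetical sample of n = 8 independent observations.
observations = [4.1, 3.8, 5.0, 4.6, 4.3, 3.9, 4.8, 4.4]

# Parametric estimation: assume the normal family and estimate
# theta = (mu, sigma) from the sample (sample mean and sample
# standard deviation).
fitted = NormalDist.from_samples(observations)
theta_hat = (fitted.mean, fitted.stdev)

x = 4.5
parametric = fitted.cdf(x)            # F(x; theta_hat) under the assumed model
nonparametric = ecdf(observations, x)  # model-free estimate of F(x)

print(f"theta_hat = ({theta_hat[0]:.4f}, {theta_hat[1]:.4f})")
print(f"F({x}) parametric = {parametric:.3f}, nonparametric = {nonparametric:.3f}")
```

The parametric estimate is only as good as the assumed family: if the data are not close to normal, F(x; θ̂) can be misleading, while the empirical distribution function remains valid but is coarse for small n.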
A sample is a collection of independent identically distributed random elements, i.e., a set of observed values (x1, x2, …, xn) or a set of objects drawn from the population under study, where n (the sample size) is the number of observed values in the sample (Orlov, 2004).
REFERENCES
Belova, S.E., Oshkin, I.Yu., Glagolev, M.V., Lapshina, E.D., Maksyutov, Sh.Sh., and Dedysh, S.N., Methanotrophic bacteria in cold seeps of the floodplains of northern rivers, Microbiology (Moscow), 2013, vol. 82, pp. 743–750.
Bohn, T.J., Podest, E., Schroeder, R., Pinto, N., McDonald, K.C., Glagolev, M., Filippov, I., Maksyutov, S., Heimann, M., Chen, X., and Lettenmaier, D.P., Modelling the large-scale effects of surface moisture heterogeneity on wetland carbon fluxes in the West Siberian Lowland, Biogeosci., 2013, vol. 10, pp. 6559‒6576.
Born, M., Dörr, H., and Levin, I., Methane consumption in aerated soils of the temperate zone, Tellus, 1990, vol. 42B, pp. 2‒8.
Borovkov, A.A., Matematicheskaya statistika (Mathematical Statistics), S.-Pb., Lan’, 2009.
Bostandzhiyan, V.A., Posobie po statisticheskim raspredeleniyam (Manual on Statistical Distributions), Chernogolovka: IPKhF RAS, 2013.
Cao, M., Marshall, S., and Gregson, K., Global carbon exchange and methane emissions from natural wetlands: application of a process-based model, J. Geophys. Res., 1996, vol. 101, no. D9, pp. 14399‒14414.
Corder, G.W. and Foreman, D.I., Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach, Hoboken: Wiley, 2009.
Dmitriev, E.A., Matematicheskaya statistika v pochvovedenii (Mathematical Statistics in Soil Science), Moscow: Mos. Gos. Univ., 1995.
Dörr, H., Katruff, L., and Levin, I., Soil texture parameterization of the methane uptake in aerated soils, Chemosphere, 1993, vol. 26, pp. 697‒713.
Erceg-Hurn, D.M. and Mirosevich, V.M., Modern robust statistical methods: an easy way to maximize the accuracy and power of your research, Am. Psychologist, 2008, vol. 63, pp. 591–601.
Glagolev, M., Kleptsova, I., Filippov, I., Maksyutov, S., and Machida, T., Regional methane emission from West Siberia mire landscapes, Environ. Res. Lett., 2011, vol. 6, no. 4. 045214.
Glagolev, M.V. and Filippov, I.V., Inventory of soil methane consumption, Environ. Dynamics Global Climate Change, 2011, vol. 2, no. 2(4). EDCCrev0002.
Glagolev, M.V. and Smagin, A.V., Quantitative assessment of methane emission from bogs: from soil profile to region (to 15th anniversary of research in Tomsk oblast), Dokl. Ekol. Pochvoved., 2006, vol. 3, no. 3, pp. 75‒114.
Glagolev, M.V., Golovatskaya, E.A., and Shnyrev, N.A., Greenhouse gas emission in West Siberia, Contemp. Probl. Ecol., 2008, vol. 1, pp. 136‒146.
Glagolev, M.V., Sabrekov, A.F., and Terentieva, I.E., Reply to A.V. Smagin: IV. Surface diffusion or random noise?, Environ. Dynamics Global Climate Change, 2017, vol. 8, no. 1, pp. 55‒65.
Glagolev, M.V., Sabrekov, A.F., Kleptsova, I.E., Filippov, I.V., Lapshina, E.D., Machida, T., and Maksyutov, Sh.Sh., Methane emission from bogs in the subtaiga of Western Siberia: the development of standard model, Euras. Soil Sci., 2012, vol. 45, pp. 947‒957.
Kobzar’, A.I., Prikladnaya matematicheskaya statistika. Dlya inzhenerov i nauchnykh rabotnikov (Applied Mathematical Statistics. For Engineers and Scientists), Moscow: FIZMATLIT, 2006.
Nahm, F.S., Nonparametric statistical tests for the continuous data: the basic concept and the practical use, Korean J. Anesthesiol., 2016, vol. 69, pp. 8‒14.
Orlov, A.I., Prikladnaya statistika (Applied Statistics), Moscow: Ekzamen, 2004.
Oshkin, I.Y., Wegner, C.-E., Lüke, C., Glagolev, M.V., Filippov, I.V., Pimenov, N.V., Liesack, W., and Dedysh, S.N., Gammaproteobacterial methanotrophs dominate cold methane seeps in floodplains of West Siberian rivers, Appl. Environ. Microbiol., 2014, vol. 80, pp. 5944‒5954.
Panikov, N.S., Taiga bogs—a global source of atmospheric methane?, Priroda, 1995, no. 6, pp. 14‒25.
Pett, M.A., Nonparametric Statistics in Health Care Research: Statistics for Small Samples and Unusual Distributions, Thousand Oaks: SAGE Publications, 1997.
Potter, C.S., Davidson, E.A., and Verchot, L.V., Estimation of global biogeochemical controls and seasonality in soil methane consumption, Chemosphere, 1996, vol. 32, pp. 2219‒2246.
Razali, N.M. and Wah, Y.B., Power comparisons of Shapiro‒Wilk, Kolmogorov‒Smirnov, Lilliefors and Anderson‒Darling tests, J. Statist. Model. Analyt., 2011, vol. 2, pp. 21‒33.
Rumshiskii, L.Z., Matematicheskaya obrabotka rezul’tatov eksperimenta (Mathematical Treatment of Experimental Results), Moscow: Nauka, 1971.
Sabrekov, A.F., Glagolev, M.V., Kleptsova, I.E., Machida, T., and Maksyutov, S.S., Methane emission from mires of the West Siberian taiga, Euras. Soil Sci., 2013, vol. 46, no. 12, pp. 1182–1193.
Sabrekov, A.F., Runkle, B.R.K., Glagolev, M.V., Terentieva, I.E., Stepanenko, V.M., Kotsyurbenko, O.R., Maksyutov, S.S., and Pokrovsky, O.S., Variability in methane emissions from West Siberia’s shallow boreal lakes on a regional scale and its environmental controls, Biogeosci., 2017, vol. 14, pp. 3715‒3742.
Schoder, V., Himmelmann, A., and Wilhelm, K.P., Preliminary testing for normality: some statistical aspects of a common concept, Clin. Exp. Dermatol., 2006, vol. 31, pp. 757‒761.
Shieh, G., Jan, S., and Randles, R., On power and sample size determinations for the Wilcoxon–Mann–Whitney test, Nonparametric Statistics, 2006, vol. 18, pp. 33–43.
Striegl, R.G., Diffusional limits to the consumption of atmospheric methane by soils, Chemosphere, 1993, vol. 26, pp. 715‒720.
Talanova, G.I., Climate of the reserve “Malaya Sosva”: long-term material, Environ. Dynamics Global Climate Change, 2018, vol. 9, no. 1, pp. 22‒45.
Tennant-Smith, J., Basic Statistics, Oxford: Butterworth-Heinemann, 1984.
Terentieva, I.E., Sabrekov, A.F., Ilyasov, D., Ebrahimi, A., Glagolev, M.V., and Maksyutov, S., Highly dynamic methane emission from the West Siberian boreal floodplains, Wetlands, 2018. https://doi.org/10.1007/s13157-018-1088-4.
Tyurin, Yu.P. and Shmerling, D.S., Nonparametric methods in statistics, Sotsiologiya. 4M, 2004, no. 18, pp. 154‒166.
Veretennikova, E.E. and Dyukarev, E.A., Methane emission from peat deposits of oligotrophic bogs in the south taiga subzone of Western Siberia, in Torfyaniki Zapadnoi Sibiri i tsikl ugleroda: proshloe i nastoyashchee (Peat Swamps of Western Siberia: Past and Present, Proc. 4th Intl. Field. Symp.), Titlyanova, A.A. and Dergacheva, M.I., Eds., Tomsk: Tomsk Gos. Univ., 2014, pp. 157‒159.
Warner, R.M., Applied Statistics: From Bivariate through Multivariate Techniques, 2nd ed., Thousand Oaks: SAGE Publications, 2012.
Xu, X., Yuan, F., Hanson, P.J., Wullschleger, S.D., Thornton, P.E., Riley, W.J., Song, X., Graham, D.E., Song, C., and Tian, H., Reviews and syntheses: four decades of modeling methane cycling in terrestrial ecosystems, Biogeosci., 2016, vol. 13, pp. 3735–3755.
Zinchenko, A.V., Model of soil organic matter humification and mineralization and its application for calculation of peatland ecosystems carbon budget characteristics, Environ. Dynamics Global Climate Change, 2017, vol. 8, no. 2, pp. 3‒17.
Translated by E. Makeeva
ACKNOWLEDGMENTS
The work was supported by the Russian Science Foundation, project no. 16-14-10201.
Ethics declarations
The authors declare that they have no conflict of interest. This article does not contain any studies involving animals or human participants performed by any of the authors.
Cite this article
Kallistova, A.Y., Sabrekov, A.F., Goncharov, V.M. et al. On the Application of Statistical Analysis for Interpretation of Experimental Results in Environmental Microbiology. Microbiology 88, 232–239 (2019). https://doi.org/10.1134/S002626171902005X
DOI: https://doi.org/10.1134/S002626171902005X