# How to walk on statistical mandalas as a population ecologist

## Abstract

We population ecologists, who are believed to be good at dealing with statistics, often get confused about what kinds of statistical methods we should apply to our nuisance data. There are a couple of conflicting paradigms and many associated methods in statistics. Classical frequentists’ approaches that have dominated in science have been severely criticized by the newcomers: Bayesian and evidential statistics. But both newcomers also have weak points. Researchers devoted to different statistical approaches are seeking soft landing places where they can compromise with each other. A key aspect of statistical inference is discriminating between model selection and parameter estimation. Likelihood and Fisher information play important roles in both processes. As an overview of the compromise processes, here I will introduce three contributing papers by M. L. Taper, J. M. Ponciano, R. M. Dorazio, and K. Yamamura for the special feature entitled “Bayesian, Fisherian, error, and evidential statistical approaches for population ecology.” This special feature is based on a symposium held in Tsukuba, Japan, on 11 October 2014.

## Keywords

Bayesian, Evidential, Fisherian, Frequentist, Model selection, Parameter estimation

## Introduction

When a non-native-English-speaking scientist submits his/her manuscript to an international ecological journal, he/she often asks a professional or a native-English-speaking colleague to proofread the English. However, interestingly, there are few authorized systems to encourage proofreading of statistical methods. One reason for this trend might be that statistical methods have no authorized standards as does scientific English. Statistical practices, and in some cases paradigms, are quite different among scientific fields. Population ecologists, who are believed to be relatively better at statistics than ecologists specializing in other fields, also have to consider which statistical methods and paradigms they should apply to their own research. Are we population ecologists actually good at statistics? I would say no. Most of us only specialize in specific statistical methods and paradigms.

That would be why Dr. Takashi Saitoh, who was the president of the Society of Population Ecology, asked Dr. Kohji Yamamura and myself to organize a special feature on statistics of population ecology from a broad perspective. In this introductory review, I briefly list questions and concerns about statistics that I have felt during my career as a population ecologist. I first discuss the dominance of classical frequentist approaches in the chronological order in which I personally encountered them, then I briefly discuss the two newcomers, Bayesian and evidential statistics, and finally, I introduce the three contributing articles for this special feature. This special feature is based on a symposium held in Tsukuba, Japan, on 11 October 2014.

## Dominance of classical frequentist approaches

The vast majority of textbooks on statistics in the library of my university in the middle of the 1980s were, and might still be, classified as classical frequentist statistics. Here “classical frequentist” refers to non-Bayesian or non-evidential, and mainly consists of null-hypothesis testing and *P* value worship approaches that assume normal distributions of original or transformed target variables. Like other students of population ecology, I had to start learning classical statistics when I was a graduate student. I have always wondered why regressions and ANOVA-type methods have two steps: significance tests of explanatory variables for the data variation as a whole followed by significance tests for parameters or means of sub-units. Even for a simple one-way ANOVA test for three categories, once we detect a significant difference among the three categories, we cannot simply claim that the largest mean value for a category is larger than those of the other two categories. I was taught that we had to perform appropriate post hoc tests even when the plot of mean values clearly showed the difference.

The former step is model fitting, and the latter one is parameter estimation. These two steps sometimes invoke different statistical methods, e.g., model fitting with information criteria, such as AIC or BIC, and parameter estimation with Bayesian methods. The former requires post hoc tests to compare parameters of the best models, but the latter can spontaneously compare multiple parameters after obtaining their posterior probabilities by checking the overlap of their posterior distributions. Post hoc tests are a variant of multiple comparison (Hsu 1998). Multiple comparison per se does not inherently mean post hoc tests, and there are relevant a priori tests of multiple comparison. The difference between post hoc and a priori comparison is the epistemological attitude towards data collection by researchers. If one designed the comparison before his/her data collection, the test is a priori, but it should be treated as post hoc if one did the comparison after his/her data collection. This epistemological difference would affect the complexity of calculating appropriate variances in the comparison. Much simpler methods of post hoc comparison, for example the Bonferroni test or its variants (Holm 1979; Moran 2003), often require some kind of programming skills, so one would preferably be able to claim, “I did design the comparison beforehand!”
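As a rough illustration of the kind of programming that the Bonferroni test and its sequential variant (Holm 1979) call for, the following minimal sketch adjusts a set of pairwise *P* values. The function names and the three *P* values are hypothetical and chosen only for illustration; they do not come from any data set discussed here.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its P value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_bonferroni(p_values, alpha=0.05):
    """Sequentially rejective procedure of Holm (1979).

    The k-th smallest P value (k = 1, ..., m) is compared with
    alpha / (m - k + 1); testing stops at the first non-rejection.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank = k - 1
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all remaining (larger) P values are retained
    return reject


# Hypothetical P values for the three pairwise comparisons A-B, A-C, B-C
# that follow a one-way ANOVA with three categories.
pairwise_p = [0.010, 0.040, 0.020]
print("Bonferroni:", bonferroni(pairwise_p))
print("Holm:      ", holm_bonferroni(pairwise_p))
```

With these made-up values the plain Bonferroni correction rejects only the first comparison, whereas the sequential version rejects all three, which is one reason the choice among variants matters in practice.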

*bossa nova* of statistics. At the time, besides discriminating egg shapes of two bean beetle species (Taper and Ponciano 2015), Dr. Taper was struggling with quantitative genetic problems using MANOVA (Taper 1990). He was always aware of the statistical power of constructed statistical models. He often questioned me about how many replicates we needed to obtain significant differences among treatments considering statistical power. He recommended that I read a textbook by Dr. Jerrold H. Zar (Zar 1984) rather than Biometry (Sokal and Rohlf 1981). Zar’s book (2nd ed.) was, as far as I knew, the only book that started the first chapter with frequency data analysis, which taught me the meaning of degrees of freedom.

At the beginning of the 1990s, there was a small boom of randomization statistics (Noreen 1989; Good 1993; Edgington 1995; Manly 1997) among young behavioral ecologists in Japan. Dr. Eiiti Kasuya and his collaborators claimed, “from now on, randomization will take over those classical statistics such as ANOVA and multiple regressions.” They emphasized that randomization methods were custom made, so we could adjust statistics so as to ask any question and judge any problems. Randomization tests were first devised by Dr. Ronald A. Fisher (Salsburg 2001) and extensively developed by Dr. Bradley Efron (Efron 1982; Hall 1992) as the jackknife, bootstrap, and other resampling methods for reconstructing parameter distributions of populations. One can reconstruct the background distribution believed to exist by simply or honestly resampling obtained data. It is just like believing that nature is full of fractals (Peitgen et al. 1992). Resampling plans need sophisticated stratification of variables if you have problems with multiple variables. I was not sure how to apply randomization tests to all of the statistical problems illustrated in Dr. Minaka’s mandala (Fig. 1).
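The "custom made" character of these methods may be easier to see in code. The sketch below is a minimal example, not a reconstruction of any analysis cited above: it implements a two-sample randomization (permutation) test for a difference in means and a percentile bootstrap interval for a mean. The function names, the number of resamples, and the two small samples are all assumptions made for illustration.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_test(x, y, n_perm=5000, seed=1):
    """Two-sided Monte Carlo randomization test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(_mean(x) - _mean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(_mean(pooled[:n_x]) - _mean(pooled[n_x:]))
        if diff >= observed - 1e-12:  # small tolerance for rounding error
            count += 1
    # The observed arrangement is counted as one of the permutations.
    return (count + 1) / (n_perm + 1)


def bootstrap_ci(x, n_boot=5000, level=0.95, seed=1):
    """Percentile bootstrap interval for the mean of x."""
    rng = random.Random(seed)
    n = len(x)
    # Resample the observed data with replacement, n_boot times.
    means = sorted(_mean(rng.choices(x, k=n)) for _ in range(n_boot))
    lower = means[int((1 - level) / 2 * n_boot)]
    upper = means[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical measurements from two treatments.
treatment_a = [12, 15, 14, 16, 13]
treatment_b = [22, 25, 24, 26, 23]
print("randomization P value:", permutation_test(treatment_a, treatment_b))
print("bootstrap 95% interval for mean of A:", bootstrap_ci(treatment_a))
```

Changing the test statistic inside `permutation_test` (for example to a median difference) is all that is needed to ask a different question, although, as noted above, designs with several variables require a more careful, stratified reshuffling than this simple example provides.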

In the middle of the 1990s, many population ecologists in Japan routinely used generalized linear models (GLM; e.g., Dunteman 1984; Dobson 1990; Crawley 1993) for their analyses. They fit models to their data, and examined parameter values for the models. Some models showed quite low powers of explanation, or had low adjusted or generalized determination coefficient (\(R^2\)) values (Nagelkerke 1991), but their discussions were based on highly significant parameters of the models. Some researchers applied information criteria, such as AIC and its variants, but again they derived conclusions from significant parameters even though there might have been alternative models with similar AIC values. Model selection and the following parameter summarization were somehow estranged from one another.
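To make the concern about alternative models with similar AIC values concrete, here is a minimal sketch that computes AIC from a maximized log-likelihood and converts a set of AIC values into Akaike weights, a device from the multimodel-inference literature (e.g., Burnham et al. 2011) rather than something used in the studies described above. The three models, their log-likelihoods, and their parameter counts are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model, normalized to sum to one."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical candidate models: (maximized log-likelihood, number of parameters).
candidates = {
    "density only": (-100.0, 3),
    "density + temperature": (-99.8, 4),
    "null": (-110.0, 2),
}
aics = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
weights = akaike_weights(list(aics.values()))
for (name, value), w in zip(aics.items(), weights):
    print(f"{name:24s} AIC = {value:6.1f}  weight = {w:.3f}")
```

In this made-up case the two leading models differ by less than two AIC units, so drawing conclusions from the significant parameters of the single "best" model would ignore a competitor with substantial support.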

Significance tests for parameters often ask whether the parameter values are greater or less than zero. We all know the criticisms against silly null hypotheses that reflect a lack of thinking about plausible alternatives, so finding little/no support for the nulls does little to provide evidence for the alternatives (Burnham et al. 2011). Yet we perhaps forget the criticisms when we perform GLMs. Earnest population ecologists are aware of random effects as well as fixed effects, but decisions on whether factors are fixed or random effects are often arbitrary (Royle and Dorazio 2008). Not a few articles encourage scientists to get rid of *P* values and testing between null-model and non-null-model hypotheses (e.g., Anderson et al. 2000; Stephens et al. 2005). Recently the scientific journal, “Basic and Applied Social Psychology,” has gone so far as to ban *P* value significance tests (Trafimow and Marks 2015)! But many scientific articles still adopt classical statistical methods. This situation resembles that of Mac and Linux users blaming Windows because of its inability to stop malware proliferation, while at the same time, Windows users make up the vast majority of the world’s computer-using population.

## *Bossa*-*nova* statistics from Bayesian and evidential approaches

Bayesian approaches are the most recent trend for population ecology (e.g., Ellison 2004; Qian and Shen 2007). As with randomization methods, evangelists of Bayesian statistics claimed that “everything is solved with Bayesian” (e.g., Albert 2007; McCarthy 2007; Gill 2008). Several Bayesian introductory textbooks criticize classical approaches, sometimes even consuming an entire chapter, and introduce Bayesian methods as a replacement for all of them (e.g., McCarthy 2007; McGrayne 2011). Some extremist opinions claim that Bayesian philosophy cannot coexist with classical philosophy (e.g., Ellison 2004). There was, in fact, stubborn resistance against Bayesian approaches from old schools of thought (e.g., Yamamura 2015). Students would ask, “well, we can obtain posterior distribution of target parameters, but how can we say those parameters are significantly different from zero?” Some textbooks even introduce significance tests in terms of Bayesian approaches (e.g., Albert 2007). “Then which model should we select?” is another question. Bayes factor, DIC and BIC have been proposed, but there exist *pros* and *cons* for each of them (Ward 2008; Spiegelhalter et al. 2014; Hooten and Hobbs 2015). On the other hand, there are more moderate Bayesian evangelists who would not mind combining Bayesian with other, even classical, approaches (e.g., Bolker 2008; Royle and Dorazio 2008; Qian 2010).

As the rise of randomization approaches heavily depended on advances in computer sciences, new and practical Bayesian approaches, such as Markov chain Monte Carlo (MCMC, Dorazio 2015) and Hamiltonian Monte Carlo (Stan Development Team 2015), have been enabled by progress in calculation techniques with computers. Development of Bayesian-statistics-oriented languages, such as OpenBUGS, WinBUGS, JAGS, and Stan, also accelerated the spread of Bayesian approaches (Kruschke 2011; Kéry and Schaub 2012; Stan Development Team 2015). After copying BUGS scripts from books, adjusting parameters for prior probability of one’s data, and then running the scripts, posterior distributions are returned. It is often recommended to check states of convergence of the posterior distribution by trace plots or \(\hat{R}\) values (Gelman and Rubin 1992), but those checks do not guarantee parameter convergence (Dorazio 2015).
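For readers who have only run copied scripts, the following minimal sketch shows what sits inside the black box in the simplest case: a random-walk Metropolis sampler and a basic version of the \(\hat{R}\) diagnostic of Gelman and Rubin (1992). It is not the algorithm of BUGS, JAGS, or Stan; the target (a standard normal log-density), the step size, the chain length, and the starting values are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, variance


def metropolis(log_post, start, n_iter, step, rng):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    chain = [start]
    current, current_lp = start, log_post(start)
    for _ in range(n_iter - 1):
        proposal = current + rng.gauss(0.0, step)  # symmetric proposal
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])             # W
    between_over_n = variance([mean(c) for c in chains])     # B / n
    pooled = (n - 1) / n * within + between_over_n
    return math.sqrt(pooled / within)


def log_std_normal(theta):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * theta * theta


rng = random.Random(42)
starts = (-10.0, -3.0, 3.0, 10.0)  # deliberately dispersed starting values
# Run four chains and discard the first half of each as warm-up.
chains = [metropolis(log_std_normal, s, 5000, 1.0, rng)[2500:] for s in starts]
r_hat = gelman_rubin(chains)
print("R-hat:", round(r_hat, 3))
```

A value of \(\hat{R}\) close to 1 only says that the chains look alike; chains that are all stuck in the same wrong region would also pass, which is why the diagnostic is a necessary check rather than a guarantee.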

Evidential statistical approaches appear more modest in propagation than Bayesian and other approaches (Taper and Lele 2004). They mainly rely on the invariant characteristic of maximum likelihood or variants of information criteria, and provide simple but clear ways to tell which models should be selected. Interestingly, all the following tools were invented by Dr. Fisher: *P* value, randomization test, ANOVA, and maximum likelihood estimates. Dr. Fisher himself strongly criticized Bayesian approaches (e.g., McGrayne 2011), but evidential approaches seek a harmonious collaboration with Bayesian methods as well as with classical methods. So far there seems not to have been any big booms in evidential approaches in Japan or in other regions of the world.

## Walking through Bayesian, Fisherian, error, and evidential statistical approaches

*t* test or an ANOVA. On the right-hand side, we have plenty of data, but they are too entangled to apply a simple ANOVA or even a MANOVA. So, for situations represented by the left-hand side, proper guidance would be, “collect more data!” How much data is necessary to shift into the gray rectangle region? And, what about situations in which we cannot collect more data? Non-parametric methods, and sometimes Bayesian methods, are often invoked to support small sample sizes (e.g., Hinton 2004). Note that neither non-parametric nor Bayesian methods were invented for that purpose (Neave and Worthington 1988; Noether 1991; Sprent 1993; Salsburg 2001).

The problem is more serious if your data are located on the right-hand side of the shaded rectangle in Fig. 3. Explanatory variables are correlated in complicated ways, and variables to be explained are also highly entangled. Applying classification methods, such as cluster analyses and correspondence analyses, may reveal distant relationships among the variables, but some criteria for grouping them are necessary. The proper guidance for such situations is merely “muddle through whatever tools you have!” (Taper and Ponciano 2015). One way of “muddling through” might be to construct hierarchical models with the Bayesian method or variants of GLM methods. But still one should be aware of the non-identifiability problem (Raue et al. 2013). MCMC methods are powerful enough to output tentative posterior probabilities of parameters; however, these may be scientifically nonsensical (see Dorazio 2015; Taper and Ponciano 2015).

This Special Feature is another, albeit non-visualized and rather verbal, mandala. You have to read through it, but after that, you will be able to visualize your own image in order to solve your statistical problems. In this Special Feature, we have three contributing papers by four statistics experts from different disciplines: Dr. Mark L. Taper and Dr. José M. Ponciano from evidential statistics, Dr. Robert M. Dorazio from Bayesian statistics, and Dr. Kohji Yamamura from Fisherian statistics.

Dr. Taper and Dr. Ponciano first overview different statistical approaches: Fisherian, Bayesian, error, and evidential, in terms of population ecology. Their long introduction shows conflicts among the approaches from methodological as well as philosophical points of view. Then, they discuss the evidential statistical approach in depth. This explanation might be a good place to start for those who have never heard the name, “evidential statistical approach.” The final section is a detailed list of misunderstandings and confusion of statistics in general, with which population ecologists will no doubt be confronted at some point in their research. Readers might be willing to compare these comments with previous ones from different points of view (e.g., Burnham et al. 2011).

Dr. Dorazio demonstrates contemporary views and attitudes of Bayesian approaches. Based on the learning aspects of Bayesian approaches, he tries to persuade us that “hierarchical modeling” is an engine for current research in the field of population ecology. He strongly recommends Bayesian approaches as a first-choice method. He is not a fanatical Bayesian evangelist at all, and discusses the *pros* and *cons* of Bayesian approaches. In particular, he admits that the weakness in choosing prior probability and model comparison has not yet been solved solely within Bayesian approaches, and hence, he recommends combinations with other statistical approaches for those issues. He also provides brief but lucid explanations of MCMC techniques, which most users of Bayesian software packages leave as black boxes. His examples are very useful and practical even for Bayesian beginners.

Dr. Yamamura describes himself as a Fisherian rather than a frequentist. He has repeatedly claimed in academic meetings that “Bayesian estimates can be used as an approximation to maximum likelihood (ML) estimates,” which becomes the title of his article. His main criticism of Bayesian approaches is the ill effects of inappropriate prior probabilities of parameters. He then proposes a Bayesian approximation of objective ML with appropriate transformation that makes the posterior distribution close to a normal one. He explains his idea, named the “empirical Jeffreys prior,” with a practical example of sika deer populations in Hokkaido, Japan. The approximation method is, as Dr. Taper has repeatedly indicated, believed to have a tight relationship with data cloning (Lele et al. 2007).
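The link between Bayesian estimates and ML estimates can be illustrated with data cloning in its simplest conjugate form. The sketch below is not Dr. Yamamura's empirical Jeffreys prior method; it only shows, for a binomial likelihood with a Beta prior, how the posterior mean approaches the ML estimate and how the number of clones times the posterior variance approaches the asymptotic variance of the ML estimate as the data are cloned. The counts, the deliberately informative Beta(5, 1) prior, and the function name are hypothetical.

```python
def cloned_posterior(successes, trials, clones, prior_a=1.0, prior_b=1.0):
    """Posterior mean and variance of a binomial probability when the data
    are entered `clones` times under a Beta(prior_a, prior_b) prior."""
    a = prior_a + clones * successes
    b = prior_b + clones * (trials - successes)
    post_mean = a / (a + b)
    post_var = a * b / ((a + b) ** 2 * (a + b + 1))
    return post_mean, post_var


# Hypothetical data: 3 successes in 10 trials.
successes, trials = 3, 10
mle = successes / trials                 # ML estimate of the probability
mle_var = mle * (1 - mle) / trials       # its asymptotic variance

for clones in (1, 10, 100, 10000):
    post_mean, post_var = cloned_posterior(successes, trials, clones,
                                           prior_a=5.0, prior_b=1.0)
    print(f"K = {clones:5d}  posterior mean = {post_mean:.4f}  "
          f"K * posterior variance = {clones * post_var:.5f}")
print(f"ML estimate = {mle:.4f}  asymptotic variance = {mle_var:.5f}")
```

With a single copy of the data the informative prior pulls the estimate well away from 0.3, but the influence of the prior fades as the number of clones grows, which is the sense in which a Bayesian computation can serve as a device for obtaining ML estimates.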

After reading through the above three articles, I am convinced that readers will have a better understanding of what model selection is and of what parameter estimation is, as well as learn what kinds of tools, such as ML and Bayesian procedures, have been implemented for those purposes. Discriminating as well as properly combining (not confusing) these two aspects will work as a compass as readers “muddle through” the mandalas of statistics.

## Notes

### Acknowledgments

I thank Mark L. Taper, José M. Ponciano, Robert M. Dorazio, and K. Yamamura for their contributions to this Special Feature. I thank E. Kasuya for his helpful comments on an earlier manuscript. I thank N. Minaka and B. Efron for allowing me to modify and reuse their mandalas. This study was supported in part by a Grant-in-Aid for Scientific Research (26440233) to YT from JSPS.

## References

- Albert J (2007) Bayesian computation with R. Springer, New York
- Anderson DR, Burnham KP, Thompson W (2000) Null hypothesis testing: problems, prevalence, and an alternative. J Wildl Manag 64:912–923
- Bolker BM (2008) Ecological models and data in R. Princeton University Press, Princeton, New Jersey
- Burnham KP, Anderson DR, Huyvaert KP (2011) AIC model selection and multimodel inference in behavioral ecology: some background, observations, and comparisons. Behav Ecol Sociobiol 65:23–35
- Crawley MJ (1993) GLIM for ecologists. Blackwell Scientific Publications, Oxford
- Dobson AJ (1990) An introduction to generalized linear models. Chapman & Hall, London
- Dorazio RM (2015) Bayesian data analysis in population ecology: motivations, methods, and benefits. Popul Ecol. doi: 10.1007/s10144-015-0503-4
- Dunteman GH (1984) Introduction to linear models. Sage, Beverly Hills
- Edgington ES (1995) Randomization tests. Marcel Dekker, New York
- Efron B (1982) The jackknife, the bootstrap and other resampling plans. SIAM, Philadelphia
- Efron B (1998) R. A. Fisher in the 21st century. Stat Sci 13:95–122
- Ellison AM (2004) Bayesian inference in ecology. Ecol Lett 7:509–520
- Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7:457–472
- Gill J (2008) Bayesian methods: a social and behavioral sciences approach. Chapman & Hall/CRC, Boca Raton
- Good P (1993) Permutation tests: a practical guide to resampling methods for testing hypotheses. Springer, New York
- Hall P (1992) The bootstrap and Edgeworth expansion. Springer, New York
- Hinton PR (2004) Statistics explained, 2nd edn. Routledge, London
- Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 6:65–70
- Hooten MB, Hobbs NT (2015) A guide to Bayesian model selection for ecologists. Ecol Monogr 85:3–28
- Hsu JC (1998) Multiple comparisons: theory and methods. Chapman & Hall, London
- Kéry M, Schaub M (2012) Bayesian population analysis using WinBUGS: a hierarchical perspective. Academic Press, Burlington
- Kruschke JK (2011) Doing Bayesian data analysis: a tutorial with R and BUGS. Academic Press, Burlington
- Lele SR, Dennis B, Lutscher F (2007) Data cloning: easy maximum likelihood estimation for complex ecological models using Bayesian Markov Chain Monte Carlo methods. Ecol Lett 10:551–563
- Manly BF (1997) Randomization, bootstrap and Monte Carlo methods in biology, 2nd edn. Chapman & Hall, Boca Raton
- McCarthy MA (2007) Bayesian methods for ecology. Cambridge University Press, Cambridge
- McGrayne SB (2011) The theory that would not die. Yale University Press, New Haven
- Moran MD (2003) Arguments for rejecting the sequential Bonferroni in ecological studies. Oikos 100:403–405
- Nagelkerke NJD (1991) A note on a general definition of the coefficient of determination. Biometrika 78:691–692
- Neave HR, Worthington PL (1988) Distribution-free tests. Routledge, London
- Noether GE (1991) Introduction to statistics: the nonparametric way. Springer, New York
- Noreen EW (1989) Computer intensive methods for testing hypotheses: an introduction. Wiley, New York
- Peitgen HO, Jürgens H, Saupe D (1992) Chaos and fractals: new frontiers of science. Springer, New York
- Qian SS (2010) Environmental and ecological statistics with R. Chapman and Hall/CRC Press, Boca Raton
- Qian SS, Shen Z (2007) Ecological applications of multilevel analysis of variance. Ecology 88:2489–2495
- Raue A, Kreutz C, Theis F, Timmer J (2013) Joining forces of Bayesian and frequentist methodology: a study for inference in the presence of non-identifiability. Philos Trans R Soc A 371:20110544
- Royle JA, Dorazio RM (2008) Hierarchical modeling and inference in ecology. Academic Press, Amsterdam
- Salsburg D (2001) The lady tasting tea: how statistics revolutionized science in the twentieth century. Owl Books, New York
- Sokal RR, Rohlf FJ (1981) Biometry, 2nd edn. W. H. Freeman & Company, San Francisco
- Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A (2014) The deviance information criterion: 12 years on. J R Stat Soc B 76:485–493
- Sprent P (1993) Applied nonparametric statistical methods, 2nd edn. Chapman & Hall, London
- Stan Development Team (2015) Stan modeling language users guide and reference manual, ver. 2.7.0
- Stephens PA, Buskirk SW, Hayward GD, del Rio CM (2005) Information theory and hypothesis testing: a call for pluralism. J Appl Ecol 42:4–12
- Taper ML (1990) Experimental character displacement in the adzuki bean weevil, *Callosobruchus chinensis*. In: Fujii K, Gatehouse AMR, Johnson CD, Mitchel R, Yoshida T (eds) Bruchids and legumes: economics, ecology and coevolution. Kluwer Academic, Dordrecht, pp 289–301
- Taper ML, Lele SR (eds) (2004) The nature of scientific evidence: statistical, philosophical, and empirical considerations. University of Chicago Press, Chicago
- Taper ML, Ponciano JM (2015) Evidential statistics as a statistical modern synthesis to support 21st century science. Popul Ecol. doi: 10.1007/s10144-015-0533-y
- Trafimow D, Marks M (2015) Editorial. Basic Appl Soc Psychol 37:1–2
- Ward EJ (2008) A review and comparison of four commonly used Bayesian and maximum likelihood model selection tools. Ecol Model 211:1–10
- Yamamura K (2015) Bayes estimates as an approximation to maximum likelihood estimates. Popul Ecol. doi: 10.1007/s10144-015-0526-x
- Zar JH (1984) Biostatistical analysis, 2nd edn. Prentice-Hall Inc., Englewood Cliffs