
Validation of null model tests using Neyman–Pearson hypothesis testing theory

  • Original paper
  • Theoretical Ecology

Abstract

A long-standing question in ecology is whether interspecific competition affects co-occurrence patterns of species. Null model tests of presence–absence data (NMTPAs) constitute an important approach to address the question, but different tests often give conflicting results when applied to the same data. Neyman–Pearson hypothesis testing theory provides a rigorous and well-accepted framework for assessing the validity and optimality of statistical tests. Here, I treat NMTPAs within this framework, and measure the robustness and bias of 72 representative tests. My results indicate that, when restrictive assumptions are met, existing NMTPAs are adequate, but for general testing situations, the use of all existing NMTPAs is unjustified: the tests are nonrobust or biased. For many current applications of NMTPAs, restrictive assumptions appear unmet, so these results illustrate an area in which existing NMTPAs can be improved. In addition to highlighting useful improvements to existing NMTPAs, the results here provide a rigorous framework for developing improved methods.

References

  • Berger JO (2003) Could Fisher, Jeffreys, and Neyman have agreed on testing? Stat Sci 18:1–32

  • Besag J, Clifford P (1989) Generalized Monte Carlo significance tests. Biometrika 76:633–642

  • Bickel PJ, Klaassen CA, Ritov Y, Wellner JA (1993) Efficient and adaptive estimation for semiparametric models. The Johns Hopkins University Press, Baltimore

  • Bolger DT, Alberts AC, Soule ME (1991) Occurrence patterns of bird species in habitat fragments: sampling, extinction, and nested species subsets. Am Nat 137:155–166

  • Brown JH, Fox BJ, Kelt DA (2000) Assembly rules: desert rodent communities are structured at scales from local to continental. Am Nat 156:314–321

  • Brown JH, Kelt DA, Fox BJ (2002) Assembly rules and competition in desert rodents. Am Nat 160:815–818

  • Casella G, Berger RL (2002) Statistical inference, 2nd edn. Duxbury, Pacific Grove

  • Chaves LF, Anez N (2004) Species co-occurrence and feeding behavior in sand fly transmission of American cutaneous leishmaniasis in western Venezuela. Acta Trop 92:219–224

  • Connor EF, Simberloff D (1978) Species number and compositional similarity of the Galapagos flora and avifauna. Ecol Monogr 48:219–248

  • Connor EF, Simberloff D (1979) The assembly of species communities: chance or competition? Ecology 60:1132–1140

  • Connor EF, Simberloff D (1983) Interspecific competition and species co-occurrence patterns on islands: null models and the evaluation of evidence. Oikos 41:455–465

  • Cox DR, Hinkley DV (2000) Theoretical statistics. Chapman and Hall/CRC, Boca Raton

  • Diamond JM (1975) Assembly of species communities. In: Ecology and evolution of communities. Harvard University Press, Cambridge, pp 342–344

  • Feeley K (2003) Analysis of avian communities in Lake Guri, Venezuela, using multiple assembly rule models. Oecologia 137:104–113

  • Fisher RA (1925) Statistical methods for research workers. Oliver and Boyd, Edinburgh

  • Fox BJ (1999) The genesis and development of guild assembly rules for guilds. In: The search for assembly rules in ecological communities. Cambridge University Press, Cambridge, pp 23–57

  • Fox BJ, Brown JH (1993) Assembly rules for functional groups in North American desert rodent communities. Oikos 67:358–370

  • Fox BJ, Brown JH (1995) Reaffirming the validity of the assembly rule for functional groups or guilds: a reply to Wilson. Oikos 73:125–132

  • Gilpin ME, Diamond JM (1982) Factors contributing to non-randomness in species co-occurrences on islands. Oecologia 52:75–84

  • Gotelli NJ (2000) Null model analysis of species co-occurrence patterns. Ecology 81:2606–2621

  • Gotelli NJ, Ellison AM (2002) Assembly rules for New England ant assemblages. Oikos 99:591–599

  • Gotelli NJ, McCabe DJ (2002) Species co-occurrence: a meta-analysis of J. M. Diamond’s assembly rules model. Ecology 83:2091–2096

  • Gotelli NJ, Rohde K (2002) Co-occurrence of ectoparasites of marine fishes: a null model analysis. Ecol Lett 5:86–94

  • Gotelli NJ, Buckley NJ, Wiens JA (1997) Co-occurrence of Australian land birds: Diamond’s assembly rules revisited. Oikos 80:311–324

  • Heino J, Soininen J (2005) Assembly rules and community models for unicellular organisms: patterns in diatoms of boreal streams. Freshw Biol 50:567–577

  • Kelt DA, Brown JH (1999) Community structure and assembly rules: confronting conceptual and statistical issues with data on desert rodents. In: The search for assembly rules in ecological communities. Cambridge University Press, Cambridge, pp 75–107

  • Koopman KR (1958) Land bridges and ecology in bat distribution on islands off the northern coast of South America. Evolution 12:429–439

  • Ladau J, Schwager SJ (2008) Robust hypothesis tests for independence in community assembly. J Math Biol 57:537–555

  • Lehmann EL (1993) The Fisher, Neyman–Pearson theories of testing hypotheses: one theory or two? J Am Stat Assoc 88:1242–1249

  • Lehmann EL, Romano JP (2005) Testing statistical hypotheses, 3rd edn. Springer, New York

  • Meserve PL, Glanz WE (1978) Geographical ecology of small mammals in the northern Chilean arid zone. J Biogeogr 5:135–148

  • Mouillot D, George-Nascimento M, Poulin R (2005) Richness, structure and functioning in metazoan parasite communities. Oikos 109:447–460

  • Patterson BD (1999) Nestedness temperature calculator program. http://aics-research.com/nestedness/tempcalc.html

  • Peres-Neto PR, Olden JD, Jackson DA (2001) Environmentally constrained null models: site suitability as occupancy criterion. Oikos 93:110–120

  • Pielou DP, Pielou EC (1968) Association among species of infrequent occurrence: the insect and spider fauna of Polyporus betulinus (Bulliard) Fries. J Theor Biol 21:202–216

  • Ribichich AM (2005) From null community to non-randomly structured actual plant assemblages: parsimony analysis of species co-occurrences. Ecography 28:88–98

  • Robert CP, Casella G (1999) Monte Carlo statistical methods. Springer, New York

  • Robson DS (1972) Statistical tests of significance. Appendix to a paper by EC Pielou. J Theor Biol 34:350–352

  • Ross SM (2006) Simulation, 4th edn. Academic, San Diego

  • Sanders NJ, Gotelli NJ, Heller NE, Gordon DM (2003) Community disassembly by an invasive species. Proc Natl Acad Sci U S A 100:2474–2477

  • Schervish MJ (1995) Theory of statistics. Springer, New York

  • Schluter D (1984) A variance test for detecting species associations, with some example applications. Ecology 65:998–1005

  • Schoener TW, Adler GH (1991) Greater resolution of distributional complementarities by controlling for habitat affinities: a study with Bahamian lizards and birds. Am Nat 137:669–692

  • Silvey SD (2003) Statistical inference. CRC, Boca Raton

  • Simberloff D, Connor EF (1981) Missing species combinations. Am Nat 118:215–239

  • Stone L, Roberts A (1990) The checkerboard score and species distributions. Oecologia 85:74–79

  • Stone L, Dayan T, Simberloff D (1996) Community-wide assembly patterns unmasked: the importance of species’ differing geographical ranges. Am Nat 148:997–1015

  • Stone L, Dayan T, Simberloff D (2000) On desert rodents, favored states, and unresolved issues: scaling up and down regional assemblages and local communities. Am Nat 156:322–328

  • Wilson JB (1987) Methods for detecting non-randomness in species co-occurrences: a contribution. Oecologia 73:579–582

  • Wilson JB (1995a) Null models for assembly rules: the Jack Horner effect is more insidious than the narcissus effect. Oikos 72:139–144

  • Wilson JB (1995b) Fox and Brown’s ‘random data sets’ are not random. Oikos 74:543–544

Acknowledgements

This work was supported by an NSF Graduate Research Fellowship, Cornell University, the Sierra Nevada Aquatic Research Laboratory (University of California, Santa Barbara), and the Santa Fe Institute. The author thanks S. J. Schwager, H. K. Reeve, and R. Knapp for extensive help and suggestions. This paper benefitted from insightful and useful comments from an anonymous reviewer.

Author information

Corresponding author

Correspondence to Joshua Ladau.

Appendix

Null hypothesis

Various null hypotheses have been considered for NMTPAs, including null hypotheses that are consistent with species equivalence, site equivalence, and a lack of interspecific interactions. This paper considers the null hypothesis that species are distributed independently of each other (e.g., Connor and Simberloff 1983, p. 463). This null hypothesis is reasonable because interspecific interactions should cause species to occur nonindependently—for instance, with competition, species should be less likely to occur when their competitors are present. If desired, the approach applied here can easily be extended to other null hypotheses.

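Let m and n denote the number of species observed and the number of sites sampled, respectively. The sample space, Ω, is the set of binary (0–1) presence–absence matrices with m rows and n columns. Let \(\mathcal{P}\) be the class of all probability measures on \((\Omega, 2^{\Omega})\), where \(2^{\Omega}\) is the power set of Ω. For b ∈ Ω, i ∈ {1,...,m}, and \(P \in \mathcal{P}\), define the following random variables on \((\Omega, 2^{\Omega}, P)\): let \(\mathbf{D}_{i}\) give the distribution of species i over the n sites; i.e., \(\mathbf{D}_{i} :\Omega \to \{ 0,1\} ^{n}\) such that \(\mathbf{D}_{i}(\mathbf{b})\) gives row i of b. Let \(f_{i}\) be the distribution of \(\mathbf{D}_{i}\), i.e., \(f_{i}\) is the induced probability measure on \(\{0,1\}^{n}\) such that, for any \(\mathbf{d} \in \{0,1\}^{n}\), \( f_{i} (\mathbf{d})=P\{\mathbf{D}_{i}^{-1} (\mathbf{d})\}\). The null hypothesis is:

$$ H_{0} \equiv \Big\{ P \in \mathcal{P} : P(\{\mathbf{b}\}) = \prod_{i=1}^{m} f_{i}(\mathbf{D}_{i}(\mathbf{b})) {\rm \; for\; all\; } \mathbf{b} \in \Omega \Big\} . $$
(1)

The alternative hypothesis is \(H_{A} \equiv \mathcal{P} \setminus H_{0}\).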
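To make the independence condition concrete, the following Python sketch enumerates Ω for a very small matrix and checks whether a given measure factorizes into the product of its row marginals \(f_{i}\). The dictionary representation of a measure and the name `satisfies_h0` are illustrative assumptions, not part of the original analysis, and full enumeration is only feasible for tiny m and n.

```python
import itertools

def satisfies_h0(measure, m, n, tol=1e-12):
    """Check whether P(b) equals the product of row marginals f_i for every b.

    `measure` maps a matrix, written as a tuple of m row-tuples of length n,
    to its probability; matrices missing from the dict have probability 0.
    """
    # f[i][d] accumulates the marginal probability that row i equals d.
    f = [{} for _ in range(m)]
    for b, prob in measure.items():
        for i in range(m):
            f[i][b[i]] = f[i].get(b[i], 0.0) + prob
    # Compare P(b) with the product of marginals over the whole sample space.
    rows = list(itertools.product((0, 1), repeat=n))
    for b in itertools.product(rows, repeat=m):
        product = 1.0
        for i in range(m):
            product *= f[i].get(b[i], 0.0)
        if abs(measure.get(b, 0.0) - product) > tol:
            return False
    return True

# Two species, one site: independent coin flips versus a strict checkerboard.
independent = {((x,), (y,)): 0.25 for x in (0, 1) for y in (0, 1)}
checkerboard = {((1,), (0,)): 0.5, ((0,), (1,)): 0.5}
```

In this toy case the independent measure satisfies the condition, whereas the checkerboard measure, in which the two species never co-occur, does not.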

Models

The behavior of NMTPAs was investigated under four models (sets of assumptions). For i ∈ { 1,...,m} and j ∈ { 1,...,n} , let X ij :Ω→{ 0,1} be an indicator variable for the presence of species i at site j; i.e.,

$$ \label{m-1} X_{ij}(\mathbf{b})=\left\{\begin{array}{cc} {0} & {{\rm if\; element\; }i,j{\rm \; of\; }\mathbf{b}{\rm \; is\; 0}} \\ {1} & {{\rm if\; element\; }i,j{\rm \; of\; }\mathbf{b}{\rm \; is\; 1}} \end{array}\right . $$
(2)

Define \( p_{ij} \equiv P\{X_{ij}^{-1} (1)\}\), and let P be the m×n matrix whose i,jth entry is p ij . The first two models are:

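$$ \mathcal{M}_{0} \equiv \big\{ P : p_{ij} \in [0,1] {\rm \; for\; all\; } i \in \{ 1,...,m\} ,\; j \in \{ 1,...,n\} \big\} $$
(3)

and

$$ \mathcal{M}_{1} \equiv \big\{ P : p_{ij} = p_{kl} {\rm \; for\; all\; } i,k \in \{ 1,...,m\} ,\; j,l \in \{ 1,...,n\} \big\} . $$
(4)

ℳ2 and ℳ3 are defined analogously to ℳ1, except p ij  = p kj and p ij  = p il , respectively. As noted earlier, ℳ0 is the only model of these four that is generally realistic.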
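As a rough illustration of how the four models constrain the matrix of occurrence probabilities, the sketch below reports which models a given matrix is compatible with. It assumes that ℳ0 places no restriction on the p ij values, and the function name `models_containing` is hypothetical.

```python
import numpy as np

def models_containing(P, tol=1e-12):
    """Report which of M0..M3 are compatible with an m x n matrix of p_ij."""
    P = np.asarray(P, dtype=float)
    # M2: p_ij = p_kj, so every row repeats the first row (species are equivalent).
    in_m2 = bool(np.all(np.abs(P - P[0:1, :]) <= tol))
    # M3: p_ij = p_il, so every column repeats the first column (sites are equivalent).
    in_m3 = bool(np.all(np.abs(P - P[:, 0:1]) <= tol))
    # M1: a single p for all entries, which requires both conditions at once.
    in_m1 = in_m2 and in_m3
    # M0 is treated as unrestricted here (an assumption of this sketch).
    return {"M0": True, "M1": in_m1, "M2": in_m2, "M3": in_m3}

# Sites differ in hospitability but species are interchangeable.
example = models_containing([[0.2, 0.8], [0.2, 0.8]])
```

For the example matrix, only ℳ0 and ℳ2 are reported as compatible.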

It is worth noting that models ℳ1 to ℳ3 also constrain the possible effects of interspecific interactions. By the definition of conditional probability, for all i,k ∈ { 1,...,m} and j ∈ { 1,...,n},

$$ \begin{array}{lll} \label{m-7} P(X_{ij}=1)= & P(X_{ij}=1|X_{kj}=1)P(X_{kj}=1) \\ & +P(X_{ij}=1|X_{kj}=0)P(X_{kj}=0). \end{array} $$
(5)

Model ℳ1 asserts that, for all i ∈ { 1,...,m} and j ∈ { 1,...,n}, there exists p ∈ [0,1] such that p ij  = p. Thus, it follows from (5) that

$$ \begin{array}{lll} \label{m-8} P(X_{ij}=1|&X_{kj}=1) \\ & =\left[p-P(X_{ij}=1|X_{kj}=0)(1-p)\right]/p. \end{array} $$
(6)

An absence of interspecific interactions implies that, for all i,k ∈ { 1,...,m} and j ∈ { 1,...,n}, P(X ij  = 1|X kj  = 1) = P(X ij  = 1|X kj  = 0). By contrast, competition might cause, for example, P(X 11 = 1|X 21 = 1) = 0.3, P(X 11 = 1| X 21 = 0) = 0.7, P(X 12 = 1|X 22 = 1) = 0.3, and P(X 12 = 1| X 22 = 0) = 0.7. When p = 0.5, this scenario is consistent with Eq. 6, and hence, it is consistent with ℳ1. However, if P(X 11 = 1|X 21 = 1) = 0.3, P(X 11 = 1| X 21 = 0) = 0.7, P(X 12 = 1|X 22 = 1) = 0.4, and P(X 12 = 1| X 22 = 0) = 0.8, then there is no p that satisfies (6) for all i,k ∈ { 1,...,m} and j ∈ { 1,...,n}, so even though this scenario is consistent with competition, it is inconsistent with ℳ1. Similar conditions can be derived for ℳ2 and ℳ3. These conditions may be unrealistic in many situations: interspecific interactions may have effects different from those specified by the models.
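The two numerical scenarios above can be checked directly. Setting P(X kj  = 1) = p in (5) and solving for p gives p = c/(1 − a + c), where a and c denote the conditional probabilities of occurrence given that the other species is present and absent, respectively. The short sketch below applies this rearrangement; the function name `implied_p` is an illustrative choice.

```python
def implied_p(p_given_present, p_given_absent):
    """Solve p = a*p + c*(1 - p) for p, where a and c are the two conditionals."""
    denom = 1.0 - p_given_present + p_given_absent
    if denom == 0.0:
        # Happens only when a = 1 and c = 0, in which case every p solves the equation.
        raise ValueError("p is not uniquely determined")
    return p_given_absent / denom

# First scenario: both sites imply the same p, so a single p exists.
site1 = implied_p(0.3, 0.7)
# Second scenario: the second site implies a different p, so no single p exists.
site2 = implied_p(0.4, 0.8)
```

Here `site1` equals 0.5 and `site2` equals 4/7, which is why the second scenario cannot be reconciled with a single value of p.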

Hypothesis tests

Let d 0 and d 1 be the decisions to accept and reject the null hypothesis (H 0 ), respectively. The nonrandomized hypothesis test δ:Ω→{ d 0 ,d 1 } has critical region \(\Omega _{1} \equiv \delta ^{-1} (d_{1} )\) (Bickel et al. 1993, chapter 1; Schervish 1995, chapter 4; Silvey 2003, chapter 6; Lehmann and Romano 2005, chapters 1–3).

Define the following random variables on the probability space \((\Omega, 2^{\Omega}, P)\), where \(2^{\Omega}\) is the power set of Ω:

$$ \label{ht-1} T:\Omega \to {\mathbb N} \mid T(\mathbf{b}){\rm \; gives\; the\; total\; of\; all\; entries\; of\; } \mathbf{b}. $$
(7)
$$ \label{ht-2} \mathbf{R}:\Omega \to {\mathbb N} ^{m} \mid \mathbf{R}(\mathbf{b}) {\rm \; gives\; the\; row\; sums\; of\; } \mathbf{b}. $$
(8)
$$ \label{ht-3} \mathbf{C}:\Omega \to {\mathbb N} ^{n} \mid \mathbf{C}(\mathbf{b}){\rm \; gives\; the\; column\; sums\; of\;} \mathbf{b} . $$
(9)

The tests are combinations of four test statistics, four estimation methods for P, and five conditioning methods. In a few cases, different combinations produce the same test, resulting in 72 rather than 80 distinct tests. The four statistics used (Gotelli 2000) are the checkerboard score (Diamond 1975), number of unique species combinations (“Combo”; Pielou and Pielou 1968), C-score (Stone and Roberts 1990), and V-ratio (Robson 1972; Schluter 1984), denoted here by W 1(b) to W 4(b), respectively. For simplicity, W 2(b) and W 4(b) are defined as the negatives of the usual values of Combo and V-ratio, respectively, so all critical regions consist of large values.
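For readers who want to experiment, the margins T, R, and C and the four statistics can be sketched in Python as follows. The definitions used here are the commonly cited ones and are assumptions of this sketch rather than the author's code: W 1 counts species pairs that share no sites, Combo counts distinct site (column) compositions, the C-score averages (r i − S)(r k − S) over species pairs, where S is the number of shared sites, and the V-ratio divides the variance of site richness by the summed variances of the species rows, using population variances.

```python
import itertools
import numpy as np

def margins(b):
    """Return T (grand total), R (row sums), and C (column sums)."""
    b = np.asarray(b, dtype=int)
    return int(b.sum()), b.sum(axis=1), b.sum(axis=0)

def pair_counts(b):
    """Yield (r_i, r_k, shared sites) for every species pair i < k."""
    b = np.asarray(b, dtype=int)
    for i, k in itertools.combinations(range(b.shape[0]), 2):
        yield int(b[i].sum()), int(b[k].sum()), int(np.dot(b[i], b[k]))

def w1_checkerboard(b):
    """Number of species pairs that never co-occur."""
    return sum(1 for _, _, shared in pair_counts(b) if shared == 0)

def w2_neg_combo(b):
    """Negative of the number of distinct species combinations across sites."""
    b = np.asarray(b, dtype=int)
    return -len({tuple(col) for col in b.T})

def w3_c_score(b):
    """Mean number of checkerboard units per species pair (needs m >= 2)."""
    units = [(ri - s) * (rk - s) for ri, rk, s in pair_counts(b)]
    return sum(units) / len(units)

def w4_neg_v_ratio(b):
    """Negative variance ratio; undefined if every row is constant."""
    b = np.asarray(b, dtype=float)
    return -b.sum(axis=0).var() / b.var(axis=1).sum()

# A small hypothetical matrix: 3 species (rows) by 4 sites (columns).
b = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 1, 0, 0]]
T, R, C = margins(b)
```

With the negated versions of Combo and the V-ratio, larger values of every statistic point toward segregation among species.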

The four estimators \(\hat{p}_{ij} \) for p ij (i ∈ { 1,...,m} and j ∈ { 1,...,n} ) follow from assuming that the p ij values are (1) all equal, (2) equal among rows (species) and proportional to the observed column totals, (3) equal among columns (sites) and proportional to the observed row totals, or (4) proportional to both the observed row and column totals. These are given by: (1) \(\hat{p}_{ij} =T(\mathbf{b})/(mn)\); (2) \(\hat{p}_{ij} =\sum _{k=1}^{m}X_{kj}(\mathbf{b}) / m\); (3) \(\hat{p}_{ij} =\sum _{l=1}^{n}X_{il}(\mathbf{b}) / n\); and (4) \(\hat{p}_{ij} =\min \big\{1, \big\{\sum_{k=1}^{m}X_{kj} (\mathbf{b}) \big\} \big\{\sum _{l=1}^{n}X_{il}(\mathbf{b}) \big\} / T(\mathbf{b}) \big\}\). For estimators 1 to 4, let \(\hat{P}_{1} ,...,\hat{P}_{4} \) be the respective probability measures on \((\Omega, 2^{\Omega})\), where \(2^{\Omega}\) is the power set of Ω, such that \(\hat{P}(\mathbf{b})\equiv \prod _{i=1}^{m}\prod _{j=1}^{n}\hat{p}_{ij}^{X_{ij} (\mathbf{b})} (1-\hat{p}_{ij} )^{1-X_{ij} (\mathbf{b})} \).
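A minimal Python sketch of the four estimators and of the product measure they induce is given below. It normalizes column totals by the number of species m and row totals by the number of sites n, so that every estimate is a proportion in [0,1]; that normalization, together with the function names, is an assumption of this sketch.

```python
import numpy as np

def estimate_p(b, method):
    """Return the m x n matrix of estimated occurrence probabilities."""
    b = np.asarray(b, dtype=float)
    m, n = b.shape
    total = b.sum()
    row = b.sum(axis=1, keepdims=True)  # shape (m, 1): occurrences per species
    col = b.sum(axis=0, keepdims=True)  # shape (1, n): richness per site
    if method == 1:   # all entries equal
        return np.full((m, n), total / (m * n))
    if method == 2:   # proportional to column totals only
        return np.repeat(col / m, m, axis=0)
    if method == 3:   # proportional to row totals only
        return np.repeat(row / n, n, axis=1)
    if method == 4:   # proportional to both margins, capped at 1
        if total == 0:
            return np.zeros((m, n))
        return np.minimum(1.0, row * col / total)
    raise ValueError("method must be 1, 2, 3, or 4")

def matrix_probability(b, p_hat):
    """Probability of b when entries are independent Bernoulli(p_hat_ij)."""
    b = np.asarray(b, dtype=int)
    return float(np.prod(np.where(b == 1, p_hat, 1.0 - p_hat)))

# A small hypothetical matrix: 3 species (rows) by 4 sites (columns).
b = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 1, 0, 0]])
```

For this matrix, estimators 1 and 3 both give 0.5 in every cell, which illustrates how different estimators can coincide for particular data.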

Five sets of tests Δ1 ,...,Δ5 can then be defined by conditioning on different events. Each set contains 16 tests, corresponding to the 16 combinations of W ∈ { W 1 ,..,W 4 } and \(\hat{P}\in \{ \hat{P}_{1} ,...,\hat{P}_{4} \} \). In some cases, different \(\hat{P}\in \{ \hat{P}_{1} ,...,\hat{P}_{4} \} \) give identical tests, resulting in fewer than 16 distinct tests. The five sets are:

$$ \label{ht-4} \Delta _{1} \equiv \big\{ \delta : \Omega _{1} = \big\{ \mathbf{b}:\mathbf{b}\in \Omega ;\; \hat{P}\{W\ge W(\mathbf{b})\}\le \alpha \big\} \big\} , $$
(10)
$$ \label{ht-5} \Delta _{2} \equiv \big\{ \delta : \Omega _{1} = \big\{ \mathbf{b}: \mathbf{b}\in \Omega ;\; \hat{P}\{W\ge W(\mathbf{b})|T=T(\mathbf{b})\}\le \alpha \big\} \big\} , $$
(11)
$$ \label{ht-6} \Delta _{3} \equiv \big\{ \delta : \Omega _{1} = \big\{ \mathbf{b}: \mathbf{b}\in \Omega ;\; \hat{P}\{W\ge W(\mathbf{b})|\mathbf{R}=\mathbf{R}(\mathbf{b})\}\le \alpha \big\} \big\} , $$
(12)
$$ \label{ht-7} \Delta _{4} \equiv \big\{ \delta : \Omega _{1} = \big\{ \mathbf{b}: \mathbf{b}\in \Omega ;\; \hat{P}\{W\ge W(\mathbf{b})|\mathbf{C}=\mathbf{C}(\mathbf{b})\}\le \alpha \big\} \big\} , $$
(13)

and

$$ \label{ht-12} \Delta _{5} \equiv \big\{ \delta : \Omega _{1} = \big\{ \mathbf{b}:\mathbf{b}\in \Omega ;\; \hat{P}\{W\ge W(\mathbf{b})|\mathbf{R}=\mathbf{R}(\mathbf{b}),\; \mathbf{C}=\mathbf{C}(\mathbf{b})\}\le \alpha \big\} \big\} . $$
(14)

The motivation for the tests in Δ1 to Δ5 is discussed elsewhere (e.g., Connor and Simberloff 1979; Gilpin and Diamond 1982; Gotelli 2000). It is routine to verify that \(\delta _{CS} = (\delta :\delta \in \Delta _{5} ,\; W=W_{3} ,\; \hat{P}=\hat{P}_{1} )\); \(\delta _{GD} = (\delta :\delta \in \Delta _{1} ,\; W=W_{1} ,\; \hat{P}=\hat{P}_{4} )\); and \(\delta _{G} = (\delta :\delta \in \Delta _{3} ,\; W=W_{3} ,\; \hat{P}=\hat{P}_{1} )\).
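As an illustration of how membership in a critical region might be approximated in practice, the sketch below implements one test from Δ1 and one from Δ2, both using the C-score and estimator 1. It rejects when the simulated upper-tail probability is at most α, following the statement that critical regions consist of large values of W. It uses direct sampling rather than the Markov chain approach described under Numerical procedures, and all function names and defaults are illustrative assumptions.

```python
import numpy as np

def c_score(b):
    """Mean checkerboard units (r_i - S)(r_k - S) over species pairs (m >= 2)."""
    b = np.asarray(b, dtype=int)
    shared = b @ b.T                      # shared[i, k] = number of shared sites
    r = np.diag(shared)                   # r[i] = occurrences of species i
    units = (r[:, None] - shared) * (r[None, :] - shared)
    upper = np.triu_indices(b.shape[0], k=1)
    return float(units[upper].mean())

def monte_carlo_test(b, alpha=0.05, n_sim=1000, condition_on_total=False, seed=0):
    """Approximate the upper-tail probability of the C-score under estimator 1."""
    rng = np.random.default_rng(seed)
    b = np.asarray(b, dtype=int)
    m, n = b.shape
    observed = c_score(b)
    p_hat = b.mean()                      # estimator 1: T(b) / (mn)
    exceed = 0
    for _ in range(n_sim):
        if condition_on_total:
            # Given T, equal cell probabilities make every arrangement of the
            # T ones equally likely, so shuffling all cells samples that law.
            sim = rng.permutation(b.ravel()).reshape(m, n)
        else:
            sim = (rng.random((m, n)) < p_hat).astype(int)
        exceed += int(c_score(sim) >= observed)
    p_value = exceed / n_sim
    return p_value, bool(p_value <= alpha)

# Two hypothetical 2 x 10 matrices: identical rows and a perfect checkerboard.
identical = np.array([[1, 0] * 5, [1, 0] * 5])
checker = np.array([[1, 0] * 5, [0, 1] * 5])
```

Tests that condition on R, C, or both need a sampler that preserves those margins, which is not attempted here.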

Bias and robustness

Let α be a number between 0 and 1 that gives the maximum tolerable type I error rate. The size of δ under model ℳ is

$$ \alpha _{T} \equiv \sup _{P\in H_{0} \cap \mathcal{M}} P(\Omega _{1} ). $$
(15)

If α < α T , then δ is nonrobust under ℳ. Define

$$ \beta _{I} \equiv \inf _{P\in H_{A} \cap \mathcal{M}} P(\Omega _{1} ). $$
(16)

A test is unbiased under ℳ if β I  ≥ α T (Schervish 1995, chapter 4; Casella and Berger 2002, p. 387; Silvey 2003, chapter 6; Lehmann and Romano 2005, chapter 4). The specific aims of this study were threefold: under ℳ0 to ℳ3 (1) to find lower bounds for α T , (2) to find upper bounds for β I , and (3) to examine whether the bounds depend on m and n.

Numerical procedures

For finding lower bounds for α T under ℳ j ∈ { ℳ0,..., ℳ3 }, \(P(\Omega _{1} )\) was evaluated for several measures P ∈ H 0 ∩ ℳ j . The measures and the models to which they belong are given in Fig. 2. The measures correspond to the following ecological scenarios: P 1 reflects the situation that all species are equally likely to occur at all sites. P 2 reflects the situation that roughly half of the sites are hospitable while the other half are inhospitable. P 3 reflects the situation that roughly half the species are unlikely to occur, while the other half are likely to occur. P 4 to P 8 reflect other situations of interspecific and spatial heterogeneity. For convenience, the entire set of these measures, which is a subset of H 0, is denoted \(\hat{H}_0\).

Fig. 2 Values of P used in defining \(\hat{H}_{0} \equiv \{ P_{1} ,...,P_{8} \} \) and \(\hat{H}_{A} \equiv \{ P_{9} ,...,P_{13} \} \). Each square represents the value of P for the given measure. Light and dark shading represents low and high values of p ij , respectively. Values on the axes indicate row and column numbers. [ ·] denotes the floor function. For i ∈ { 1,...,m} , j ∈ { 1,...,n} , and k ∈ { 9,...,12} , P k (X ij  = 1|X i − 1,j = 0) = 1.5p ij . The measures are elements of the following models: P 1 ∈ ℳ0 ∩ ℳ1 ∩ ℳ2 ∩ ℳ3 ; P 2 ∈ ℳ0 ∩ ℳ2 ; P 3 ∈ ℳ0 ∩ ℳ3 ; P 4 , P 5 ,...,P 8 ∈ ℳ0 ; P 9 , P 10 , P 11 ∈ ℳ0 ∩ ℳ1 ∩ ℳ2 ∩ ℳ3 ; P 12 ∈ ℳ0 ∩ ℳ2 ; and P 13 ∈ ℳ0 ∩ ℳ3

To find upper bounds for β I under ℳ j ∈ { ℳ0,..., ℳ3 }, \(P(\Omega _{1} )\) was evaluated for several choices of P ∈ H A  ∩ ℳ j . The measures are defined in Fig. 2, and they correspond to the following ecological scenarios: P 9 , P 10 , and P 11 reflect the situation that all species are equally likely to occur at all sites. P 12 reflects the situation that roughly half of the sites are less hospitable than the other half. P 13 reflects the situation that roughly half of the species are more likely to occur than the other half. All of these measures reflect a scenario in which each even-numbered species is 1.5 times as likely to occur if the preceding species is absent than if it is present. Similar measures, wherein even-numbered species were twice as likely to occur, were also considered. Such dependence is consistent with widespread competitive interactions. The set of these measures is denoted \(\hat{H}_A\).

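For the above tests and ℳ j ∈ { ℳ0,...,ℳ3 }, define

$$ \hat{\alpha }_{T,j} \equiv \max _{P\in \hat{H}_{0} \cap \mathcal{M}_{j}} P(\Omega _{1} ) $$
(17)

and

$$ \hat{\beta }_{I,j} \equiv \min _{P\in \hat{H}_{A} \cap \mathcal{M}_{j}} P(\Omega _{1} ). $$
(18)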

For convenience, when ℳ j is implied, the subscript j is omitted from \(\hat{\alpha }_{T,j}\) and \(\hat{\beta }_{I,j}\). Because \(\hat{H}_{0} \subseteq H_{0} \), \(\hat{\alpha }_{T} \le \alpha _{T} \). Hence, a test is concluded to be nonrobust under ℳ j if \(2\alpha \le \hat{\alpha }_{T} \), a generous criterion because one would not want to make type I errors at a rate of twice the tolerable level. Likewise, because \(\hat{H}_A \subseteq H_{A} \), \(\beta _{I} \le \hat{\beta }_{I} \). Hence, a test is concluded to be biased under ℳ j if \(2\hat{\beta }_{I} \le \hat{\alpha }_{T} \), as this implies that β I  < α T , again a generous criterion.
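The two decision criteria can be written as a small helper. The sketch assumes that \(\hat{\alpha }_{T}\) is the largest estimated rejection rate over the null measures considered and \(\hat{\beta }_{I}\) the smallest over the alternative measures considered; the function name and the input rates are hypothetical and are not results from this study.

```python
def classify_test(alpha, null_rates, alt_rates):
    """Apply the nonrobustness and bias criteria to estimated rejection rates."""
    alpha_hat = max(null_rates)   # estimated lower bound on the size
    beta_hat = min(alt_rates)     # estimated upper bound on the minimum power
    return {
        "alpha_T_hat": alpha_hat,
        "beta_I_hat": beta_hat,
        # Nonrobust: type I errors at twice the tolerable rate or more.
        "nonrobust": 2 * alpha <= alpha_hat,
        # Biased: some alternative is rejected at most half as often as some null.
        "biased": 2 * beta_hat <= alpha_hat,
    }

# Hypothetical rejection rates for two imaginary tests.
poor = classify_test(0.05, null_rates=[0.04, 0.12], alt_rates=[0.30, 0.05])
good = classify_test(0.05, null_rates=[0.05, 0.06], alt_rates=[0.40, 0.50])
```

In this made-up example the first test is flagged as both nonrobust and biased, and the second is flagged as neither.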

To find \(\hat{\alpha }_{T} \) and \(\hat{\beta }_{I} \), for each ℳ j ∈ { ℳ0,...,ℳ3 }, \(\delta \in \bigcup _{i=1}^{5}\Delta _{i} \), and \(P \in ( \hat{H}_{0} \cup \hat{H}_{A}) \cap \mathcal{M}_{j}\), I estimated \(P(\Omega _{1} )\). To do so, I began by setting α = 0.05. For each combination of δ and P, I generated 1,000 matrices from P. I then checked whether each of these matrices was an element of Ω1 by generating 1,000 random matrices from the appropriate \(\hat{P}\). I estimated \(P(\Omega _{1} )\) by the proportion of matrices from P that were elements of Ω1 . For tests that conditioned on different statistics but were otherwise identical, I reused empirical distributions. I generated empirical distributions using a Markov chain Monte Carlo/Hastings-Metropolis algorithm (Besag and Clifford 1989; Robert and Casella 1999; Ross 2006, chapter 10). The algorithms were implemented in Visual Basic 6.0 (code available from the author).
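The nested simulation described above can be sketched in Python for the simplest case: an unconditional test using the C-score and estimator 1, applied to matrices drawn from a measure with independent entries. This is not the author's Visual Basic implementation. It replaces the Markov chain Monte Carlo step with direct sampling, it cannot draw from the dependent measures in \(\hat{H}_A\), it uses an upper-tail rejection rule, and every name and default in it is an assumption.

```python
import numpy as np

def c_score(b):
    """Mean checkerboard units over species pairs; 0.0 if there is only one species."""
    b = np.asarray(b, dtype=int)
    shared = b @ b.T
    r = np.diag(shared)
    units = (r[:, None] - shared) * (r[None, :] - shared)
    upper = np.triu_indices(b.shape[0], k=1)
    return float(units[upper].mean()) if upper[0].size else 0.0

def rejects(b, alpha, n_null, rng):
    """Inner loop: is b in the critical region of the unconditional test?"""
    observed = c_score(b)
    p_hat = b.mean()  # estimator 1
    null = [c_score((rng.random(b.shape) < p_hat).astype(int)) for _ in range(n_null)]
    return np.mean(np.asarray(null) >= observed) <= alpha

def estimate_rejection_rate(P, alpha=0.05, n_outer=1000, n_null=1000, seed=0):
    """Outer loop: proportion of matrices drawn from P that fall in the critical region."""
    rng = np.random.default_rng(seed)
    P = np.asarray(P, dtype=float)
    hits = 0
    for _ in range(n_outer):
        # Independent Bernoulli entries, so this measure satisfies the null hypothesis.
        b = (rng.random(P.shape) < P).astype(int)
        hits += int(rejects(b, alpha, n_null, rng))
    return hits / n_outer

# Small run on a hypothetical 3 x 4 matrix of equal occurrence probabilities.
rate = estimate_rejection_rate(np.full((3, 4), 0.5), n_outer=20, n_null=50)
```

With the defaults of 1,000 outer and 1,000 inner draws the cost grows quickly with matrix size, which is one practical reason to reuse empirical distributions across tests that differ only in the conditioning statistic.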

To assess whether bias and robustness depend on matrix dimensions, I examined six values of (m,n), the dimensions of the presence–absence matrices in Ω. The dimensions were from presence–absence matrices that have been analyzed previously: 11×25, 17×19, 20×20 (all analyzed in Gotelli 2000), 5×30, 8×9, and 55×7 (all analyzed in Gotelli and McCabe (2002); source data, respectively: Bolger et al. (1991), Meserve and Glanz (1978), and Koopman (1958)).

Analytic results

It can be shown that (1) T is minimally sufficient for P under ℳ1, (2) C is minimally sufficient for P under ℳ2, and (3) R is minimally sufficient for P under ℳ3. Hence, tests that condition (1) on T under ℳ1, (2) on C under ℳ2, and (3) on R under ℳ3 may have Neyman structure (Cox and Hinkley 2000, p. 135; Lehmann and Romano 2005, p. 115). This structure is indeed present in the following tests under the following models: \(\delta :\delta \in \Delta _{2} ,\hat{P}=\hat{P}_{1} \) under ℳ1; \(\delta :\delta \in \Delta _{3} ,\hat{P}=\hat{P}_{1} \) under ℳ1 and ℳ3; \(\delta :\delta \in \Delta _{4} ,\hat{P}=\hat{P}_{1} \) under ℳ1 and ℳ2; \(\delta :\delta \in \Delta _{5} ,\hat{P}=\hat{P}_{1} \) under ℳ1, ℳ2, and ℳ3; \(\delta :\delta \in \Delta _{3} ,\hat{P}=\hat{P}_{3} \) under ℳ3; and \(\delta :\delta \in \Delta _{4} ,\hat{P}=\hat{P}_{4} \) under ℳ2. It follows that, under these models, α T  = α for these tests (Cox and Hinkley 2000, chapter 5).
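The role of T under ℳ1 can be seen from a short calculation. The display below assumes, for illustration only, that all mn entries of the matrix are mutually independent with common occurrence probability p; this is stronger than independence among species alone.

$$ P(\mathbf{b})=\prod _{i=1}^{m}\prod _{j=1}^{n}p^{X_{ij} (\mathbf{b})} (1-p)^{1-X_{ij} (\mathbf{b})} =p^{T(\mathbf{b})} (1-p)^{mn-T(\mathbf{b})} $$

Thus P(b) depends on b only through T(b). Because there are \(\binom{mn}{t}\) matrices with T(b) = t, each equally probable,

$$ P(\mathbf{b}\mid T=t)=\binom{mn}{t}^{-1} {\rm \; for\; every\; } \mathbf{b} {\rm \; with\; } T(\mathbf{b})=t, $$

which does not involve p. Under this assumption, the conditional null distribution of any statistic W given T is therefore the same for every value of p, which is the property that conditioning on a sufficient statistic is meant to exploit.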

Cite this article

Ladau, J. Validation of null model tests using Neyman–Pearson hypothesis testing theory. Theor Ecol 1, 241–248 (2008). https://doi.org/10.1007/s12080-008-0024-2
