Environmental and Ecological Statistics, Volume 3, Issue 4, pp 329–347

Obtaining species: sample size considerations

  • Trent L. McDonald
  • David S. Birkes
  • N. Scott Urquhart


Suppose fish are to be sampled from a stream. A fisheries biologist might ask one of the following three questions: ‘How many fish do I need to catch in order to see all of the species?’, ‘How many fish do I need to catch in order to see all species whose relative frequency is more than 5%?’, or ‘How many fish do I need to catch in order to see a member from each of the species A, B, and C?’. This paper offers a practical solution to such questions by setting a target sample size designed to achieve desired results with known probability. We present three sample size methods, one we call ‘exact’ and the others approximate. Each method is derived under assumed multinomial sampling, and requires (at least approximate) independence of draws and (usually) a large population. The minimum information needed to compute one of the approximate methods is the estimated relative frequency of the rarest species of interest. Total number of species is not needed. Choice of a sample size method depends largely on available computer resources. One approximation (called the ‘Monte Carlo approximation’) gets within ±6 units of exact sample size, but usually requires 20–30 minutes of computer time to compute. The second approximation (called the ‘ratio approximation’) can be computed manually and has relative error under 5% when all species are desired, but can be as much as 50% or more too high when exact sample size is small. Statistically, this problem is an application of the ‘sequential occupancy problem’. Three examples are given which illustrate the calculations so that a reader not interested in technical details can apply our results.
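The kind of calculation the abstract describes can be illustrated with a small sketch. It assumes multinomial sampling with known relative frequencies for the species of interest, and uses the standard inclusion-exclusion formula for the probability that every one of those species appears at least once in n independent draws. This is not necessarily the paper's 'exact' method; the function names `prob_all_seen` and `min_sample_size`, the `n_max` cap, and the example frequencies are hypothetical choices made for illustration.

```python
from itertools import combinations


def prob_all_seen(freqs, n):
    """Probability that every species of interest appears at least once
    in n independent draws, given their relative frequencies.

    Inclusion-exclusion over subsets S of the species of interest:
        P = sum over S of (-1)^|S| * (1 - sum of freqs in S)^n
    Species not of interest only need to be reflected in the fact that
    freqs may sum to less than 1.
    """
    total = 0.0
    for size in range(len(freqs) + 1):
        sign = -1.0 if size % 2 else 1.0
        for subset in combinations(freqs, size):
            # Clamp at zero to guard against tiny negative rounding error.
            remaining = max(0.0, 1.0 - sum(subset))
            total += sign * remaining ** n
    return total


def min_sample_size(freqs, gamma, n_max=100_000):
    """Smallest n with prob_all_seen(freqs, n) >= gamma.

    n_max is an arbitrary safety cap (an assumption of this sketch).
    The number of subsets grows as 2^k, so this is only practical for
    a modest number k of species of interest.
    """
    for n in range(len(freqs), n_max + 1):
        if prob_all_seen(freqs, n) >= gamma:
            return n
    raise ValueError("no sample size up to n_max reaches the target probability")


if __name__ == "__main__":
    # Hypothetical example: three species of interest with assumed
    # relative frequencies of 5%, 10% and 20%; target probability 0.95.
    print(min_sample_size([0.05, 0.10, 0.20], 0.95))
```

When only one species is of interest the formula reduces to 1 - (1 - p)^n, which shows why the relative frequency of the rarest species dominates the required sample size.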


multinomial distribution, occupancy problem, species richness, urn model





Copyright information

© Chapman & Hall 1996

Authors and Affiliations

  • Trent L. McDonald (1)
  • David S. Birkes (2)
  • N. Scott Urquhart (2)

  1. West Inc., Cheyenne, USA
  2. Department of Statistics, Oregon State University, Corvallis, USA
