The consumer and resource distributions of the trophic species (Cohen et al. 1990) in 51 food webs were analyzed. The data are all the webs with 25 or more trophically distinct taxa (Cohen et al. 1990) from two recent studies (Stouffer et al. 2007; Thompson et al. 2007); details of the data are given in Tables S2 and S3 of the Electronic supplementary material. These are among the largest and best-resolved data available, and while still subject to the many criticisms that food web data have received (Cohen et al. 1993), the many robust patterns found in these methodologically heterogeneous data (Stouffer et al. 2007; Thompson et al. 2007; Williams and Martinez 2008) give confidence that these findings are not the result of consistent bias in the data.
Two resource distributions were considered, termed the “all-species resource distribution” and the “restricted resource distribution.” The “all-species resource distribution” is defined as the distribution of the number of resources of each species, including the basal species, which consume no resources. This model is constrained only by knowledge of S and L. The “restricted resource distribution” is defined as the distribution of the number of resources of only the consumer species. As such, it includes prior knowledge of the number of basal species B and does not attempt to predict the fraction of basal species. Similarly, two consumer distributions are considered, the “all-species consumer distribution” and the “restricted consumer distribution.” The “all-species consumer distribution” is defined as the distribution of the number of consumers of each species, including the top species, which have no consumers. This model is constrained only by knowledge of S and L. The “restricted consumer distribution” is defined as the distribution of the number of consumers of the resource species, includes prior knowledge of the number of top species T, and does not attempt to predict the fraction of top species.
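The four distributions defined above can be tabulated directly from a list of feeding links. The following is a minimal sketch, not the procedure used in the study: the function name `degree_distributions`, the `(resource, consumer)` link encoding, and the four-species example web are all hypothetical and chosen only for illustration.

```python
def degree_distributions(n_species, links):
    """Tabulate resource and consumer counts for a food web.

    links: iterable of (resource, consumer) pairs, species indexed 0..n_species-1.
    (Hypothetical encoding, assumed for this sketch.)
    """
    n_resources = [0] * n_species  # number of resources of each species
    n_consumers = [0] * n_species  # number of consumers of each species
    for resource, consumer in set(links):  # set() drops duplicate links
        n_resources[consumer] += 1
        n_consumers[resource] += 1
    return {
        # "all-species" distributions keep the zeros (basal or top species)
        "all_resource": n_resources,
        "all_consumer": n_consumers,
        # "restricted" distributions keep only species with at least one link
        "restricted_resource": [k for k in n_resources if k > 0],
        "restricted_consumer": [k for k in n_consumers if k > 0],
        "B": sum(1 for k in n_resources if k == 0),  # basal species
        "T": sum(1 for k in n_consumers if k == 0),  # top species
    }


# Illustrative web: species 0 is basal, species 3 is a top species.
example = degree_distributions(4, [(0, 1), (0, 2), (1, 2), (2, 3)])
```

In this example S = 4, L = 4, and B = 1, so the restricted resource counts have mean 4/3, which equals L/(S − B).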
In the “all-species” distributions, the number of consumers or resources of each species can range from 0 to S and the mean number of links per species is L/S. In the “restricted” resource distribution, the number of links to each consumer can potentially range from 1 to S and the mean number of links to each consumer is L/(S − B). In the “restricted” consumer distribution, the number of links from each resource can potentially range from 1 to S and the mean number of links from each resource is L/(S − T). In general, the problem is to find a discrete distribution on a set of n values, here either {0,…,S} or {1,…,S} but more generally \( \{x_1, \ldots, x_n\} \), with mean μ that maximizes \( H = -\sum_i p_i \ln p_i \) subject to a set of constraints. This MaxEnt distribution is \( p_i = P(X = x_i) = Ce^{\lambda x_i} \) for i = 1,…,n. The constants C and λ are determined by the constraints that the probabilities sum to 1 and have mean μ (the number of links to or from each node): \( \sum_i p_i = 1 \) and \( \sum_i x_i p_i = \mu \) (Jaynes 1957; Cover and Thomas 2006). The derivation, using Lagrange multipliers, is given in the Electronic supplementary material.
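Because the mean of the exponential-form distribution increases monotonically with λ, the two constraints can be solved numerically with a one-dimensional root search. The sketch below uses bisection, which is an assumption made here for illustration; the source does not state which numerical method was used. The function name `maxent_distribution` and the values S = 30, L = 60 are hypothetical.

```python
import math


def maxent_distribution(xs, mu, iterations=200):
    """Return (probs, lam) with probs[i] proportional to exp(lam * xs[i]) and mean mu."""
    if not min(xs) < mu < max(xs):
        raise ValueError("mu must lie strictly between min(xs) and max(xs)")

    def probs(lam):
        exponents = [lam * x for x in xs]
        shift = max(exponents)  # subtract the maximum to avoid overflow
        weights = [math.exp(e - shift) for e in exponents]
        total = sum(weights)  # normalization plays the role of 1/C
        return [w / total for w in weights]

    def mean(lam):
        return sum(x * p for x, p in zip(xs, probs(lam)))

    # Expand a bracket [lo, hi] until it contains the root of mean(lam) = mu.
    lo, hi = -1.0, 1.0
    while mean(lo) > mu:
        lo *= 2.0
    while mean(hi) < mu:
        hi *= 2.0

    # Bisection: mean(lam) is increasing in lam, so halve the bracket each step.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean(mid) < mu:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return probs(lam), lam


# Hypothetical all-species case: S = 30 species and L = 60 links, so mu = L/S = 2.
S, L = 30, 60
p_all, lam_all = maxent_distribution(list(range(0, S + 1)), L / S)
```

For a restricted distribution the same function would be called with the support {1,…,S} and the mean L/(S − B) or L/(S − T).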
Finally, I developed a simple model of the undirected (sum of the number of consumer and resource links) distributions by assuming that the number of consumers and resources of each node are independent. Top species have no consumers, so for T species, the number of links is drawn from the MaxEnt resource distribution. Similarly, for B species, the number of links is drawn from the MaxEnt consumer distribution. For the remaining S − B − T intermediate species, the number of links is the sum of numbers drawn from the consumer and resource distributions.
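The sampling scheme for the undirected model can be sketched as follows. This is an illustration under stated assumptions: the function name `sample_undirected_degrees` is hypothetical, the resource and consumer distributions are passed in as value and probability lists, and the placeholder uniform weights in the usage lines stand in for MaxEnt probabilities. Whether the restricted or all-species distributions are the intended inputs is not stated in the text; the usage lines assume a support starting at 1.

```python
import random


def sample_undirected_degrees(S, B, T, res_values, res_probs,
                              con_values, con_probs, rng):
    """Draw one undirected degree per species under the independence assumption."""
    # Top species have no consumers: links come from the resource distribution.
    top = rng.choices(res_values, weights=res_probs, k=T)
    # Basal species have no resources: links come from the consumer distribution.
    basal = rng.choices(con_values, weights=con_probs, k=B)
    # Intermediate species: sum of independent resource and consumer draws.
    n_mid = S - B - T
    mid_res = rng.choices(res_values, weights=res_probs, k=n_mid)
    mid_con = rng.choices(con_values, weights=con_probs, k=n_mid)
    intermediate = [r + c for r, c in zip(mid_res, mid_con)]
    return top + basal + intermediate


# Hypothetical inputs: support {1,...,10} with placeholder uniform weights.
values = list(range(1, 11))
weights = [1.0 / len(values)] * len(values)
degrees = sample_undirected_degrees(10, 3, 2, values, weights,
                                    values, weights, random.Random(1))
```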
The consumer, resource, and undirected distributions of the 51 empirical food webs were compared to the MaxEnt distributions derived using the empirical values of S, L, B, and T. Two tests of the fit of the MaxEnt models to the empirical data were used. In the first, the likelihood ratio (G) statistic (Sokal and Rohlf 1995) is used to compare an observed distribution to some expected (model) distribution. G is defined as \( G = 2\sum_i O_i \ln \left( O_i / E_i \right) \) where \( O_i \) is the observed frequency, \( E_i \) the expected frequency, and i indexes through all values in the discrete distribution with nonzero expected value. A randomization procedure is used; for each of the 10,000 trials, a sample is drawn from the MaxEnt distribution and its G value is compared to the G value of the empirical distribution where, in both cases, the expected distribution is the MaxEnt distribution. The goodness of fit, \( f_G \), is measured by the fraction of trials in which the G value of the empirical distribution is greater than the G value of the distribution drawn from the MaxEnt distribution. The empirical distribution is considered to be significantly different from the MaxEnt distribution if \( f_G > 0.95 \).
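The randomization test can be sketched as below. The function names `g_statistic` and `goodness_of_fit` are hypothetical, and two details are assumptions of this sketch rather than statements from the text: each random sample has the same size as the empirical one, and values with zero observed count contribute nothing to G (the usual convention that 0 ln 0 = 0).

```python
import math
import random
from collections import Counter


def g_statistic(sample, values, probs):
    """Likelihood ratio statistic G = 2 * sum_i O_i * ln(O_i / E_i)."""
    n = len(sample)
    counts = Counter(sample)
    total = 0.0
    for x, p in zip(values, probs):
        observed = counts.get(x, 0)
        expected = n * p
        # Only values with nonzero expected frequency enter the sum;
        # terms with zero observed count are taken as zero.
        if observed > 0 and expected > 0:
            total += observed * math.log(observed / expected)
    return 2.0 * total


def goodness_of_fit(empirical, values, probs, trials=10000, seed=0):
    """Fraction of trials in which G(empirical) exceeds G(random model sample)."""
    rng = random.Random(seed)
    g_empirical = g_statistic(empirical, values, probs)
    exceed = 0
    for _ in range(trials):
        draw = rng.choices(values, weights=probs, k=len(empirical))
        if g_empirical > g_statistic(draw, values, probs):
            exceed += 1
    return exceed / trials
```

A returned value above 0.95 corresponds to the rejection criterion stated above.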
The goodness of fit, \( f_G \), does not differentiate between webs with overly broad or narrow degree distributions, a range of variation found in an earlier study of food web degree distributions (Dunne et al. 2002). To measure whether the empirical webs were more broadly or narrowly distributed than the model distributions, I measured the relative width of a distribution \( W = \log \left( \sigma_{\text{O}} / \sigma_{\text{M}} \right) \) where \( \sigma_{\text{O}} \) is the standard deviation of the observed distribution and \( \sigma_{\text{M}} \) is the standard deviation of the model distribution. For each empirical web, the distribution of W for 10,000 webs drawn from the model distribution was computed. The quantity \( W_{95} \) is defined as the deviation of the empirical value of W from the model median normalized by the width of the upper or lower half of the central interval of the model distribution of W at the 95% significance level. This gives the normalized difference in standard deviations of the empirical distribution relative to the median standard deviation of a set of samples drawn from the model distribution and so measures the relative width of the empirical distribution. Webs with \( W_{95} < -1 \) have distributions that are significantly narrower than the model distributions; \( W_{95} > 1 \) occurs for distributions significantly broader than the model distributions.
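One way to compute the normalized width is sketched below. Several choices are assumptions made for illustration and are not specified in the text: the function name `w95`, the use of the population standard deviation for samples, linear interpolation for the 2.5%, 50%, and 97.5% quantiles, and the natural logarithm (the base cancels in the final ratio). The sketch also assumes every sample has nonzero spread, since the logarithm of a zero standard deviation is undefined.

```python
import math
import random


def sample_std(sample):
    """Population standard deviation of a list of numbers."""
    m = sum(sample) / len(sample)
    return math.sqrt(sum((x - m) ** 2 for x in sample) / len(sample))


def model_std(values, probs):
    """Standard deviation of a discrete distribution."""
    m = sum(x * p for x, p in zip(values, probs))
    return math.sqrt(sum(p * (x - m) ** 2 for x, p in zip(values, probs)))


def quantile(sorted_vals, q):
    """Quantile of an already sorted list, using linear interpolation."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def w95(empirical, values, probs, trials=10000, seed=0):
    """Deviation of W from the model median, in units of the 95% half-interval."""
    rng = random.Random(seed)
    sigma_m = model_std(values, probs)
    w_empirical = math.log(sample_std(empirical) / sigma_m)
    # Distribution of W for samples drawn from the model (same size as the data).
    w_model = sorted(
        math.log(sample_std(rng.choices(values, weights=probs,
                                        k=len(empirical))) / sigma_m)
        for _ in range(trials)
    )
    median = quantile(w_model, 0.5)
    if w_empirical >= median:
        half_width = quantile(w_model, 0.975) - median  # upper half-interval
    else:
        half_width = median - quantile(w_model, 0.025)  # lower half-interval
    return (w_empirical - median) / half_width
```

With this normalization, an empirical value lying exactly on the 2.5% or 97.5% quantile of the model distribution of W gives −1 or 1, matching the thresholds described above.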