Phylogenetic Prediction to Identify “Evolutionary Singularities”
Understanding adaptive patterns is especially difficult in the case of “evolutionary singularities,” i.e., traits that evolved in only one lineage in the clade of interest. New methods are needed to integrate our understanding of general phenotypic correlations and convergence within a clade when examining a single lineage in that clade. Here, we develop and apply a new method to investigate change along a single branch of an evolutionary tree; this method can be applied to any branch on a phylogeny, typically focusing on an a priori hypothesis for “exceptional evolution” along particular branches, for example in humans relative to other primates. Specifically, we use phylogenetic methods to predict trait values for a tip on the phylogeny based on a statistical (regression) model, phylogenetic signal (λ), and evolutionary relationships among species in the clade. We can then evaluate whether the observed value departs from the predicted value. We provide two worked examples in human evolution using original R scripts that implement this concept in a Bayesian framework. We also provide simulations that investigate the statistical validity of the approach. While multiple approaches can and should be used to investigate singularities in an evolutionary context—including studies of the rate of phenotypic change along a branch—our Bayesian approach provides a way to place confidence on the predicted values in light of uncertainty about the underlying evolutionary and statistical parameters.
Keywords: Markov Chain Monte Carlo; Target Species; Phylogenetic Signal; Posterior Probability Distribution; Model Selection Procedure
Convergence is fundamental to the comparative approach to testing adaptive hypotheses in biology. In short, we can be more confident that a trait is an adaptation if it has evolved repeatedly—rather than once—in association with another trait, environment, or other factors (Pagel 1994). Phylogeny is essential to this endeavor because an evolutionary tree provides the scaffolding upon which to identify evolutionary origins of traits and their covarying factors. Thus, phylogenetic methods are widely used to identify correlated trait evolution, to probe the factors that drive speciation and extinction, and to estimate rates of evolutionary change (Harvey and Pagel 1991; Garland et al. 2005; Nee 2006; Maddison et al. 2007; Martins 1994; Nunn 2011).
The convergence approach has proved incredibly powerful, yet this approach is not appropriate for investigating evolutionary singularities—i.e., traits that evolved in only one lineage in the clade of interest. Such a trait may appear in a single taxon on the tree, and thus, the evolutionary event occurred on the branch leading to that taxon (autapomorphy), or the singularity may occur on an internal branch, leading to its representation in all species in a subclade of multiple species (a synapomorphy). As with many concepts in cladistics, identifying singularities depends on the taxonomic level under consideration. Thus, we might say that “winged flight” is an evolutionary singularity in mammals (i.e., bats), but not among vertebrates more broadly (i.e., bats and birds).
Evolutionary singularities are especially relevant in studies of human evolution. Indeed, humans are unusual mammals with a suite of “zoologically unprecedented capacities” (Tooby and DeVore 1987, p. 183), such as language, walking with a striding gait, and wearing clothing. Humans possess complex cultural traits that build on other cultural traits and thus exhibit cumulative cultural evolution (Tennie et al. 2009). In terms of quantitative traits, humans have relatively large brains and exhibit longer periods of parental care than are found in other primates of our body mass. These traits can be examined in broad phylogenetic context using the comparative method (Barton 1996; Dunbar 1993; Deaner et al. 2000). Evolutionary anthropologists are interested in identifying the characteristics of humans that make us unique relative to other primates (Martin 2002; Kappeler and Silk 2009; Rodseth et al. 1991). Yet the novelty of our traits makes it challenging—some might say impossible—to quantitatively investigate the factors that influenced their evolution using convergence-based comparative approaches.
Considering quantitative characters such as brain size or body mass, two related approaches can be used to investigate evolutionary singularities. One approach “predicts” trait values for tips of the tree and quantifies deviations from this prediction (Garland and Ives 2000; Nunn 2011; Organ et al. 2011). The other approach estimates rates of evolutionary change along the branch of interest and then compares this rate to other branches on the tree (O’Meara et al. 2006; Revell 2008). We focus on the first of these approaches, which we call “phylogenetic prediction.”
When using the term phylogenetic prediction, we are specifically referring to predicting trait values on a tip of a tree, in contrast to the occasional use of the phrase “predicting phylogeny” to infer phylogenetic relationships. Just as it is useful to reconstruct traits at internal nodes on a phylogeny, predictions for values of traits on the tips of the tree are valuable for evolutionary research (Garland and Ives 2000; Nunn 2011). For example, predicting values on the tips of the tree can be used to estimate trait values in unmeasured species or for studying species that are too rare and endangered for handling or invasive sampling.
Here, we focus on using phylogenetic prediction to assess whether a species differs from what is expected based on both phylogeny and trait correlations. We might ask, for example, do humans have a later age at first reproduction, relative to other primates and incorporating the body mass scaling of primate life history traits? Prediction would be based on the association between body mass and age at first reproduction and—assuming phylogenetic signal in the traits or residuals from the statistical model (see Chap. 5)—our phylogenetic closeness to other apes. With a prediction in hand, we could then test whether the observed mean age at first reproduction in humans departs from expectations for other primates of our body mass, accounting for broader phylogenetic differences among the species in the sample. It is also possible to account for multiple predictor variables in the prediction, such as diet or predation risk.
In what follows, we consider a general framework for predicting trait values on the tips of the tree in a phylogenetic generalized least squares (PGLS) framework (Garland and Ives 2000; Organ et al. 2007, see Chaps. 5 and 6 for details about PGLS), and we review how this and related approaches have been used in previous comparative research. In the Online Practical Material (http://www.mpcm-evolution.org), we provide new R code and instructions to run the analyses using a Bayesian approach that also performs model selection (e.g., see Chap. 10) and controls for uncertainty in phylogenetic, evolutionary, and statistical parameters. The Online Practical Material also provides further statistical testing of the approach using simulations.
We apply our code—called BayesModelS, for “Bayesian Model Selection”—to two traits in humans in which we predict unique selection pressures relative to other anthropoid primates (monkeys and apes). First, we investigated the intermembral index (IMI). The IMI is calculated as 100 × forelimb length/hindlimb length; it has been used widely in studies of primate morphology because it covaries with categories of locomotor behavior involving vertical clinging and leaping (VCL), quadrupedal, or suspensory locomotion (Napier and Walker 1967; Martin 1990; Napier 1970). The IMI approximates 70 in primate species that exhibit VCL, 70–100 in those with quadrupedal locomotion, and 100–150 in primates that show suspensory locomotion (Martin 1990). Given this strong association, the IMI has been used to reconstruct locomotor behavior in the fossil record (e.g., Napier and Walker 1967; Martin 1990; Jungers 1978). With our highly derived (bipedal) locomotion, humans do not fall into any of these locomotor categories. In addition, our locomotion is associated with long legs and short arms—similar to species showing VCL and low IMI—but we evolved from suspensory species (all apes), which show the largest IMI values. Thus, if you had to bet on a trait as a singularity, the IMI in humans would be a good place to put your money; we test that prediction with our new computer code.
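As a small illustration of the index defined above, the following Python sketch computes the IMI and maps it onto the three locomotor ranges quoted from Martin (1990). The function names, the limb lengths in the usage lines, and the handling of values on or outside the quoted boundaries are assumptions made for illustration. The ranges are approximate and are not intended as hard classification rules.

```python
def intermembral_index(forelimb_length, hindlimb_length):
    """Return IMI = 100 * forelimb length / hindlimb length."""
    if forelimb_length <= 0 or hindlimb_length <= 0:
        raise ValueError("limb lengths must be positive")
    return 100.0 * forelimb_length / hindlimb_length


def locomotor_range(imi):
    """Map an IMI onto the approximate ranges quoted in the text.

    Boundary handling is an assumption: the text gives 'about 70' for
    vertical clinging and leaping, 70-100 for quadrupedal locomotion,
    and 100-150 for suspensory locomotion.
    """
    if imi <= 70:
        return "vertical clinging and leaping"
    if imi <= 100:
        return "quadrupedal"
    if imi <= 150:
        return "suspensory"
    return "outside quoted ranges"


# Hypothetical limb lengths in arbitrary units, not measured data.
example_imi = intermembral_index(forelimb_length=36.0, hindlimb_length=30.0)
print(round(example_imi, 1), locomotor_range(example_imi))
```

A label from such a lookup describes only where a value falls relative to the quoted ranges; it says nothing about whether the value is exceptional given phylogeny and body mass, which is the question addressed by phylogenetic prediction.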
Second, we examine predictors of white blood cell counts in humans, both to re-investigate previous findings with our new methods (Nunn et al. 2000; Nunn 2002) and to test whether humans have exceptionally high numbers of circulating white blood cells. We specifically predicted that humans would have a larger number of neutrophils than predicted for a typical primate because humans have been exposed to a large number of parasites and pathogens through our close contact with domesticated animals and through agricultural practices (e.g., indirect contact with rodents that raid food stores, more sedentary lives, and formation of vector breeding grounds through irrigation, Barrett et al. 1998). Moreover, cooking likely provided additional energy for humans (Wrangham 2009), potentially increasing investment in immune defenses, which would be reflected by having higher numbers of circulating white blood cells.
21.1 Background: Phylogenetic Prediction
21.1.1 Relevant Literature
One of the first clear descriptions of phylogenetic prediction was provided by Garland and Ives (2000), where they argued for its utility in estimating traits in extinct or unmeasured extant species. They also showed how this approach can be used to identify deviations from allometric relationships, which is similar to our use here—i.e., they provide prediction intervals on a regression and test whether a species falls outside those intervals. Garland and Ives (2000) described how to conduct phylogenetic prediction with either independent contrasts or PGLS and provided approaches for placing confidence intervals on the predictions (see also Garland et al. 1999). In applying the method, Garland and Ives (2000) showed that phylogenetic information provides better predictions of trait values in unmeasured species, specifically by shifting the predicted interval to reflect phylogenetic propinquity to other species in the dataset and narrowing the interval compared to “generic” predictions that lack phylogenetic placement of the unmeasured species.
Organ et al. (2007) used phylogenetic prediction to investigate the evolution of genome size in birds, with a focus on extinct species. It is thought that smaller genomes reduce metabolic costs and are under selection for decreased size in birds due to the energetic expenditure of flight (Hughes and Hughes 1995). Organ et al. (2007) predicted genome size based on the size of osteocytes (bone cells). To do this, they used a Bayesian phylogenetic regression analysis implemented in the program BayesTraits (Pagel and Meade 2007). First, they confirmed an association between osteocyte size and genome size in living vertebrates. Next, they generated posterior probability distributions of genome size in 31 extinct dinosaurs, controlling for uncertainty in phylogeny and statistical parameters with their Bayesian approach. Remarkably, for all but one of the extinct theropods within the lineage that gave rise to birds, genome sizes fell within the range of variation found in living birds. Their analyses therefore suggest that the evolution of reduced genome size occurred before the evolution of flight. Thus, evolutionary correlations can be used to make phylogenetically informed predictions about traits that do not fossilize, based on both phylogeny and statistical associations with other features that do fossilize (see also Organ and Shedlock 2009).
This approach has also been used to investigate evolutionary singularities in primate evolution, focusing on human dental characteristics and feeding behavior (Organ et al. 2011). Cooking food is a key behavior that has influenced many aspects of human evolution and is unique to humans (Wrangham 2009), but has this behavior influenced quantitative aspects of our feeding behavior and morphology? Again using BayesTraits, analyses revealed that food processing had major impacts on the amount of time humans spend feeding: Our phylogenetic model predicted that we should feed for 48 % of our daily activity budget if we were a typical primate with our body mass, as compared to an extremely low observed value of 4.7 % in humans. In addition, Organ et al. (2011) found that cooking influenced dental morphology, providing a way to pinpoint the timing of transitions to cooking (and other sophisticated food processing) in human evolution. With a Bayesian phylogeny of hominins, they found evidence for a reduction in molar size in Homo erectus, with the morphological change suggesting that this hominin species had already adopted significant food processing behavior well before the emergence of modern humans.
Brain evolution is also a topic of great interest in the context of human evolutionary novelty. The human brain is thought to be central to many important aspects of human uniqueness, especially in terms of our cognitive abilities and social learning (Reader and Laland 2002; Deaner et al. 2007), but it remains unclear which parts of the brain are most important for understanding human uniqueness (Sherwood et al. 2012). At a gross level, many quantitative comparative approaches have been taken to assess brain evolution on the lineage leading to Homo, which clearly involved rapid, large, and unique changes (Lieberman 2011; Allman and Martin 2000; Sherwood et al. 2008; Martin 1990). In one recent study, for example, Barton and Venditti (2013) investigated whether human frontal lobes are exceptionally large relative to other brain regions in primates. Remarkably, they found no evidence for such effects and also failed to find evidence of elevated evolutionary change in prefrontal white and gray matter (relative to other brain areas) along the human lineage (examining variation in evolutionary rates is the other approach to investigating evolutionary novelty, noted above).
A somewhat different approach to understanding hominin brain evolution was taken by Pagel (2002). In a study of how brain size changed over time across a phylogeny of fossil hominins, he showed how branch-length scaling parameters can be used to investigate the tempo and mode of evolution (e.g., in terms of acceleration of brain size in human evolution). The intercept from his regression model served as a prediction of brain size in the ancestral node, deep in the human lineage. Like the previous example, Pagel’s approach is also more closely tied to estimating rates of evolutionary change. A variety of methods have been developed in this regard, with some based on independent contrasts (McPeek 1995) and others based on detecting variation in rates using maximum-likelihood approaches (O’Meara et al. 2006; Revell 2008).
Phylogenetic prediction is not strictly limited to testing for exceptional evolution. Using the method of Garland and Ives (2000), for example, Fagan et al. (2013) predicted measures of maximum population growth rate in poorly known mammalian species based on life history characteristics. Using cross-validation procedures, they found good agreement between observed and predicted values.
A different set of methods was used to investigate extinction risk in carnivores (Safi and Pettorelli 2010). In this study, the authors modeled threat status in a clade of 192 carnivore species based on phylogeny, geography, and environmental variables, with one goal to assess how well predictions matched empirical threat levels. Using phylogenetic eigenvector regression (Diniz-Filho et al. 1998) and spatial eigenvector filtering (Diniz-Filho and Bini 2005), they found that geography and phylogeny are important predictors of threat status and probably of other biological characteristics. Thus, it is important to include phylogeny and geography in predictive models of extinction risk.
The issue of evolutionary singularities and phylogenetic prediction was discussed in Nunn (2011). The present chapter expands the discussion of phylogenetic prediction and provides code to implement some of the proposals in Nunn (2011).
21.1.2 Implementation of Phylogenetic Prediction
The first well-described implementation of phylogenetic prediction was provided by Garland and Ives (2000). Using the software PDTREE and two traits, the authors described how to re-root the tree such that the target species and its sister occur at the base of the tree. The user then makes predictions based on the value of X in the target species and the branch length connecting it to the rest of the tree. The authors also provide equations for implementing phylogenetic prediction in a PGLS framework.
As noted above, BayesTraits has also been used to predict trait values based on a regression model (Organ et al. 2007, 2011). The program works well for multiple predictor variables, it can analyze results across a block of trees to account for phylogenetic uncertainty, and it is possible to estimate three different parameters that scale the phylogeny to reflect the degree of phylogenetic signal (see Chap. 5 and below). One issue, however, is that BayesTraits is incompletely documented (prediction is not mentioned in the manual, yet it can be accomplished), and it lacks the flexibility of running analyses within a statistical package that allows programming, such as R. With R code, for example, users can more flexibly automate and adjust the analyses—e.g., combine them easily with other data transformation and statistical procedures of interest—and it is easier to test the assumptions of the methods. An additional issue is that BayesTraits does not provide an easy-to-use model selection procedure that is integrated with parameter estimation. While one could run all possible models and select among them based on likelihood or Bayesian approaches, that would be extremely time-consuming. Based on these issues, we re-implemented and extended much of the functionality of BayesTraits in R.
21.1.3 General Approach
Our phylogenetic prediction approach involves three steps. First, we build a regression model to describe how a set of independent variables predicts a response variable, where the response variable is the target trait and the target species is not included in the analysis. Second, we use the resulting regression model to predict values of the target trait in the target species, based on measured values of the independent variables for the target species and its phylogenetic position relative to the other species in the dataset. Finally, we compare the predicted value of the target trait to the actual value in the target species. The larger this difference, the more “exceptional” the target trait is in the target species relative to other species in the clade, given the statistical model and phylogeny.
We wish to emphasize that phylogenetic prediction is not simply ancestral state reconstruction, as might be used for inferring the states of interior nodes on the tree. Our approach makes use of a phylogeny, an evolutionary model, and output from a regression analysis, whereas trait reconstruction uses only a phylogeny and an evolutionary model (or the assumption of parsimony). The regression analysis incorporates phylogenetic information, typically through a variance–covariance matrix in PGLS (see Chap. 5). Importantly, we exclude the target species when estimating parameters of the PGLS model. The prediction is based on three ingredients: this model, the variance–covariance matrix that includes the target species, and the predictor variables for the target species. If the statistical model has no predictors (or if the predictors fail to account for variation in the regression model), the approach is similar to standard ancestral state reconstruction, with only phylogeny, an estimated intercept, and the underlying evolutionary model providing predictions for the target trait in the target species (i.e., the estimate is made based on deviations from the estimated intercept). If there is no phylogenetic signal in the model, predictions are based solely on the statistical model, as might occur for predictions from a standard least-squares regression.
As noted above, a variety of statistical approaches can be used to assess whether the observed value of the target trait in the target species differs from the predicted value. One could simply examine the magnitude of the difference, although this would not provide a way to judge the “significance” of the difference. It is also possible to use likelihood-based methods, for example through a likelihood ratio test. From such a test, a p-value can be obtained, which gives the probability of obtaining a difference as extreme or more extreme, assuming the null hypothesis of no difference is true. Finally, the user can (and should) calculate prediction intervals on the predicted value, again incorporating phylogeny (for equations, see Garland et al. 1999; Garland and Ives 2000).
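The ingredients described in this paragraph can be combined in a short numerical sketch. The Python code below uses the standard generalized least-squares predictor: coefficients are fitted without the target species, and the prediction adds a phylogenetic correction based on the covariances between the target and the other species. This is a sketch under stated assumptions, not the BayesModelS implementation. All function names are hypothetical, the toy data are invented, and whether BayesModelS uses exactly this form is not shown in this chapter.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gls_coefficients(X, y, C):
    """beta = (X' C^-1 X)^-1 X' C^-1 y, fitted WITHOUT the target species."""
    columns = [list(col) for col in zip(*X)]
    cinv_y = solve(C, y)
    cinv_cols = [solve(C, col) for col in columns]
    xtx = [[dot(ci, cj) for cj in cinv_cols] for ci in columns]
    xty = [dot(ci, cinv_y) for ci in columns]
    return solve(xtx, xty)


def predict_target(X, y, C, x_target, c_target):
    """Prediction = x_target . beta + c_target' C^-1 (y - X beta).

    C        : variance-covariance matrix among the non-target species
    c_target : covariances between the target and each non-target species
    """
    beta = gls_coefficients(X, y, C)
    residuals = [yi - dot(row, beta) for row, yi in zip(X, y)]
    phylo_term = dot(c_target, solve(C, residuals))
    return dot(x_target, beta) + phylo_term


# Invented toy example: intercept-only model for two non-target species.
X = [[1.0], [1.0]]
y = [2.0, 4.0]
C = [[1.0, 0.0], [0.0, 1.0]]
with_signal = predict_target(X, y, C, x_target=[1.0], c_target=[0.5, 0.0])
no_signal = predict_target(X, y, C, x_target=[1.0], c_target=[0.0, 0.0])
print(with_signal, no_signal)
```

The two calls mirror the limiting cases in the text. With no predictors, the prediction is the estimated intercept shifted toward the residual of the species that shares history with the target. With all covariances to the target set to zero (no phylogenetic signal), the prediction falls back on the regression model alone.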
Here, we use a Bayesian framework to assess departures from predictions. Specifically, our method generates a posterior probability distribution for the target trait in a target species, such as a predicted value for body mass in Homo sapiens. While many approaches could be taken, including the frequentist approaches just described, Bayesian approaches are particularly appropriate for phylogenetic prediction. First, they provide a quantitative measure of the degree of difference, measured as the proportion of posterior predictions that are more or less extreme than observed. In addition, the Bayesian framework takes into account uncertainty in other parameters, including uncertainty associated with phylogenetic topology, branch lengths, and estimation of regression parameters and phylogenetic signal (Pagel and Lutzoni 2002). Finally, model selection procedures can be easily implemented within the Bayesian framework, which thereby accounts for uncertainty in the model selection procedure itself. For Bayesian approaches, see Chap. 10.
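The "proportion of posterior predictions that are more or less extreme than observed" can be computed directly from a set of posterior draws. The Python sketch below shows one way to do so. The function name and the example draws are hypothetical, and ties with the observed value are counted in neither tail, which is an assumption of this sketch.

```python
def posterior_tail_proportions(posterior_predictions, observed):
    """Return (share of draws below observed, share of draws above observed)."""
    if not posterior_predictions:
        raise ValueError("need at least one posterior draw")
    n = len(posterior_predictions)
    below = sum(1 for v in posterior_predictions if v < observed) / n
    above = sum(1 for v in posterior_predictions if v > observed) / n
    return below, above


# Invented posterior draws of a log10-transformed trait, for illustration only.
draws = [1.90, 2.00, 2.10, 2.20]
print(posterior_tail_proportions(draws, observed=1.85))
```

If nearly all draws lie above the observed value, the observed trait is smaller than expected under the fitted model and phylogeny. The reverse pattern indicates a trait that is larger than expected.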
21.1.4 BayesModelS: R Scripts for Model Fitting, Model Selection, and Prediction
With these needs in mind, we developed new R scripts to conduct phylogenetic prediction in a Bayesian framework. BayesModelS uses a Bayesian Markov Chain Monte Carlo (MCMC) approach to obtain posterior probabilities of regression coefficients and phylogenetic scaling parameters (λ and κ) across a set of trees. The parameter λ is generally viewed as a measure of phylogenetic signal (Freckleton et al. 2002). It scales the off-diagonal elements of the variance–covariance matrix by λ, with λ = 0 equivalent to no phylogenetic signal because the internal branches collapse to zero length, resulting in a star phylogeny (Felsenstein 1985). The parameter κ raises branch lengths to the exponent κ (Pagel 1997, 2002). Thus, when κ = 0, all branches are set to be equal, which is consistent with a speciational model of evolution in which change occurs during the process of speciation (Garland et al. 1993), assuming no extinction on the tree. However, we view these parameters as ways to scale the branches to best meet the assumptions of the regression model (especially homoskedasticity), rather than being informative of the underlying evolutionary process.
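The effect of the two scaling parameters can be seen on a toy tree. The Python sketch below builds a variance–covariance matrix from root-to-tip paths, raising each branch length to the exponent κ, and then multiplies the off-diagonal elements by λ. The tree, the branch labels, and the function names are hypothetical, and the code is an illustration of the transformations described above rather than the BayesModelS implementation.

```python
def vcv_from_paths(paths, kappa=1.0):
    """Build a variance-covariance matrix from root-to-tip paths.

    paths maps each tip to {branch_id: branch_length} along its path from
    the root. Each branch length is raised to the power kappa, and the
    covariance of two tips is the summed length of their shared branches.
    Note that Python evaluates 0.0 ** 0 as 1.0, so zero-length branches
    would also become 1 when kappa is 0.
    """
    tips = sorted(paths)
    vcv = []
    for a in tips:
        row = []
        for b in tips:
            shared = set(paths[a]) & set(paths[b])
            row.append(sum(paths[a][k] ** kappa for k in shared))
        vcv.append(row)
    return tips, vcv


def lambda_transform(vcv, lam):
    """Multiply off-diagonal elements by lambda and keep the diagonal."""
    return [[v if i == j else lam * v for j, v in enumerate(row)]
            for i, row in enumerate(vcv)]


# Hypothetical tree ((A:1,B:1):2,C:3); "AB" is the shared internal branch.
paths = {
    "A": {"AB": 2.0, "A": 1.0},
    "B": {"AB": 2.0, "B": 1.0},
    "C": {"C": 3.0},
}
tips, vcv = vcv_from_paths(paths)
print(tips, vcv)
print(lambda_transform(vcv, 0.0))        # off-diagonals vanish: star phylogeny
print(vcv_from_paths(paths, kappa=0.0))  # every branch has length 1
```

Setting λ to 0 leaves only the diagonal, which corresponds to the star phylogeny mentioned above. Setting κ to 0 makes every branch equal in length, so covariances simply count the number of shared branches.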
Our script implements model selection procedures using Bayesian approaches by updating a vector indicating whether variables are included (1) or excluded (0) in the model at steps in the Markov chain and estimating coefficients for those that are included. The specific details of this procedure are provided in the Appendix. When a “missing” species is identified, it is considered to be a target species. Another function in our script then generates a posterior probability distribution of predicted trait values for that species (provided predictor variables are given for the target species).
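The bookkeeping behind an inclusion vector can be sketched in a few lines of Python. The code below shows only two pieces: a proposal that flips one indicator that is not fixed, and a summary of how often each variable was included across sampled states. Both functions are hypothetical illustrations. The actual update and acceptance rules used by BayesModelS are given in the Appendix and are not reproduced here.

```python
import random


def propose_inclusion(current, fixed_positions, rng):
    """Flip one randomly chosen, non-fixed inclusion indicator (0 <-> 1)."""
    free = [i for i in range(len(current)) if i not in fixed_positions]
    proposal = list(current)
    if free:
        flip = rng.choice(free)
        proposal[flip] = 1 - proposal[flip]
    return proposal


def inclusion_frequencies(sampled_vectors):
    """Share of sampled states in which each variable was included."""
    n = len(sampled_vectors)
    return [sum(column) / n for column in zip(*sampled_vectors)]


# Invented example: three candidate predictors, the first always included.
rng = random.Random(1)
state = [1, 0, 1]
print(propose_inclusion(state, fixed_positions={0}, rng=rng))

# Invented saved states from a chain, for illustration only.
saved = [[1, 0], [1, 1], [1, 0], [1, 1]]
print(inclusion_frequencies(saved))
```

In a full sampler, a proposal of this kind would be accepted or rejected according to the likelihood and priors, and the inclusion frequencies would be computed from post-burnin samples only.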
BayesModelS takes two files as input: one that contains one or more phylogenies with branch lengths (in “phylo” format, Paradis et al. 2004) and the other serving as a data file. The data file should include headers that indicate variable names horizontally along the top, with species names in the first column that correspond to species in the phylogeny file. The code compares species names in the data and tree files to identify mismatches or missing species and reports those discrepancies to the user. The user specifies the statistical model to be evaluated based on column names, including the possibility to fix a variable in the model so that it is always included (i.e., permanently set to “1” in the vector associated with model selection). The user can also estimate scaling parameters λ and κ or allow the MCMC procedure to select among them. Although it is possible to include λ and κ in the model selection procedure, we recommend also running analyses in which only one of these scaling parameters is estimated, to assess whether similar estimates are obtained. In addition, λ and κ can be fixed to particular values (most commonly 0 or 1). The user also defines a burnin period and sampling (thinning) rate.
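The name check described above amounts to comparing two sets of labels. A minimal Python sketch follows, assuming the species names have already been read from the first column of the data file and from the tip labels of the tree. The function name and the example labels are hypothetical, and this is not the code used in BayesModelS.

```python
def name_mismatches(data_species, tree_tips):
    """Return (names only in the data file, names only in the tree)."""
    data_set, tree_set = set(data_species), set(tree_tips)
    return sorted(data_set - tree_set), sorted(tree_set - data_set)


# Invented labels; the misspelling is deliberate.
data_species = ["Homo_sapiens", "Pan_troglodytes", "Gorila_gorilla"]
tree_tips = ["Homo_sapiens", "Pan_troglodytes", "Gorilla_gorilla"]
only_in_data, only_in_tree = name_mismatches(data_species, tree_tips)
print("Only in data:", only_in_data)
print("Only in tree:", only_in_tree)
```

Reporting both directions separately helps distinguish spelling differences, which appear in both lists, from species that are simply absent from one of the two files.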
Output includes a summary of the data and phylogeny used, a detailed log of parameters and likelihoods in the sampled MCMC iterations, and graphical output of parameter estimates. For phylogenetic prediction, the method produces graphical and quantitative output for assessing whether a target species departs from predictions. Likelihood of the data for sampled models is recorded to provide a way to assess whether burnin was reached (i.e., whether the MCMC chain reached a stable distribution that can be sampled) and whether the thinning rate is sufficient (i.e., with low correlation between adjacent saved states in the chain).
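One simple numerical check of the thinning rate is the lag-1 autocorrelation of the saved likelihoods. The Python sketch below computes it for a list of values. The function name and the example trace are hypothetical, and a low value is only a rough indication of adequate thinning; visual inspection of the trace remains useful.

```python
def lag1_autocorrelation(trace):
    """Lag-1 autocorrelation of a sequence of saved values."""
    n = len(trace)
    if n < 2:
        raise ValueError("need at least two saved states")
    mean = sum(trace) / n
    variance_sum = sum((x - mean) ** 2 for x in trace)
    if variance_sum == 0:
        return 0.0  # constant trace: report no correlation rather than divide by zero
    covariance_sum = sum((trace[i] - mean) * (trace[i + 1] - mean)
                         for i in range(n - 1))
    return covariance_sum / variance_sum


# Invented log-likelihood values from saved states, for illustration only.
saved_loglik = [-120.4, -119.8, -121.1, -120.0, -120.6, -119.9]
print(round(lag1_autocorrelation(saved_loglik), 3))
```

Values near zero suggest that adjacent saved states are close to independent. Values near one suggest that the chain should be thinned more heavily or run for longer.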
More details are provided in the Appendix and the Online Practical Material that accompany this book, including mathematical details on how the approach was implemented, simulation tests to assess the performance of the method, R code, data, and instructions for running the functions. The worked examples that follow also demonstrate the utility of the output for assessing performance of the MCMC analysis (e.g., ensuring adequate burnin and using the post-burnin samples).
21.2 Empirical Applications: How Do Humans Differ from Other Primates?
Bipedal Locomotion and the Intermembral Index. We investigated whether the intermembral index (IMI) in humans differs from what one would predict in a primate of our body mass. As described above, we predict that humans have a small IMI compared to other primates because bipedal locomotion has resulted in longer legs relative to arms and thus a lower IMI (see above and Nunn 2011). We tested this prediction using data on 117 primate species and a block of 100 trees sampled from a Bayesian posterior probability distribution provided in version 3 of 10kTrees (Arnold et al. 2010).
BayesModelS can take multiple predictor variables. Here, as a first illustration of the method, we used a simple model with only body mass as a predictor of IMI. We set the burnin to 100 iterations and sampled the chain every 100 iterations thereafter for 200,000 iterations, resulting in 2,000 samples for the posterior distribution of all parameters and the prediction for humans. We estimated λ as a measure of phylogenetic signal by setting the argument varSelection (i.e., varSelection = "lambda"). Analyses were repeated three times to ensure that they stabilized on the same parameter space, and we visually assessed whether likelihoods reached a steady state with adequate sampling. Data were log10-transformed prior to analysis. Humans were identified as “missing” in the analysis through the argument missingList in BayesModelS (i.e., missingList = c("Homo_sapiens")). The species supplied to missingList are excluded from the first step of estimating parameters in the model (along with all species for which predictor or response variables are missing in the proposed regression model). Thus, predictions that come from the model are not biased by extreme values in the target species.
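The sampling schedule described above can be checked with simple arithmetic. The Python sketch below lists the iterations that would be saved under one reading of the text, namely that the 200,000 sampled iterations follow the 100 burnin iterations and that every 100th of them is kept. That reading, and the function name, are assumptions of this sketch and say nothing about how BayesModelS counts iterations internally.

```python
def retained_iterations(burnin, n_post_burnin, thin):
    """Iteration numbers saved after burnin, keeping every thin-th state."""
    return list(range(burnin + thin, burnin + n_post_burnin + 1, thin))


kept = retained_iterations(burnin=100, n_post_burnin=200_000, thin=100)
print(len(kept), kept[0], kept[-1])
```

Under this reading, the schedule yields the 2,000 saved samples reported in the text.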
Primate White Blood Cell Counts. As a second example, we examined primate white blood cell (WBC) counts, focusing on the most common of the WBCs, neutrophils, which are involved in innate immune responses. For reasons outlined above, we predicted that humans would be evolutionary outliers in white blood cell counts. In addition to testing an interesting hypothesis in human evolution, we chose this example because two of the variables are discrete traits with three ordinal levels: Female promiscuity was coded as a three-level variable (monogamous, usually one male but not always, and typically more than one male, from van Schaik et al. 1999); terrestriality was also coded on a three-part scale reflecting typically arboreal, typically terrestrial in wooded environments, and typically terrestrial in an open environment, where the last category is associated with the greatest terrestrial substrate use (from Nunn and van Schaik 2002). We also used the model to show how λ or κ models of trait evolution can be selected using MCMC during estimation of regression coefficients. All data were log10-transformed prior to analysis.
The underlying regression model re-examines previous findings (Nunn et al. 2000; Nunn 2002). Following these studies, we predicted that neutrophil counts increase with promiscuity (to reduce risk of STD transmission), body mass (to reduce risk of dietary transmission, with larger animals eating more resources), terrestrial substrate use (to reduce risk of fecally transmitted parasites on the ground), and group size (to reduce risk of social transmission in larger groups). We also included body mass because body mass covaries with some of our variables and larger-bodied primates have been found to have higher WBC counts in previous comparative analyses (Nunn et al. 2009; Cooper et al. 2012). Previous research found support for the effects of promiscuity, but not group size or substrate use (Nunn et al. 2000; Nunn 2002).
21.2.1 Future Directions and Conclusions
Evolutionary novelties pose a serious challenge for the comparative method. Biologists would like to study these traits in broad comparative perspective, but how can this be achieved without falling into the trap of adaptive storytelling? How can we investigate evolutionary singularities (or cases of “exceptional evolution” for quantitative traits) in a statistically rigorous way? We propose that phylogenetic prediction offers a valuable solution for this challenge, especially when combined with other methods, such as comparing rates of evolutionary change in different lineages (O’Meara et al. 2006; Revell 2008). In particular, the underlying statistical model is based on widely accepted approaches to investigating adaptive evolution using phylogenetic comparative methods. When this model is applied to a single lineage in a phylogenetic context, it gives fresh insights into whether the general pattern of adaptive evolution is also explanatory in the “target species” of interest.
The importance of phylogenetic prediction is underappreciated in studies of the evolutionary process, especially when considering whether a particular species departs from the overall evolutionary pattern in a group of organisms. The approach has yet to be widely used, due in part to lack of good implementation in R, which is becoming established as the standard for comparative analyses. In this chapter, we aimed to overcome these limitations through new R scripts (BayesModelS), original analyses, and supplementary datasets that enable others to run our analyses. We focused especially on the context of human evolution. However, we expect that many other biological systems provide similar examples of evolutionary singularities for which this perspective—and versions of our code—would be useful. It is worth noting that our BayesModelS code can also be run without the prediction component to provide Bayesian PGLS with model selection.
In terms of future directions, it would be desirable to further assess the statistical performance of BayesModelS in detecting differences (some simulations are provided in the Appendix). By simulating evolutionary change under known conditions and varying the number of species, it is possible to investigate Type I error rates (when simulation along the target lineage uses the same model as in other species) and Type II error rates, and thus statistical power (when higher rates of evolution and/or directional evolution occur on the target species’ lineage, corresponding to greater evolutionary change). By using a particular phylogeny in the simulation (e.g., primates if the question involves humans as the target species), one could estimate statistical properties in the specific biological context of a hypothesized singularity in a particular lineage. It will also be important to investigate whether predictive capability declines when more extreme values of predictor variables are used, especially if those extreme values are more likely to result in singularities through nonlinear or threshold effects or if they involve extrapolation beyond available data.
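The simulation logic described above can be illustrated with a deliberately simplified sketch. It is not BayesModelS and is not Bayesian: it assumes a hypothetical five-species tree, pure Brownian motion with a known root value (0) and known rate (1), no predictor variables, and a plain 95% conditional prediction interval for the target tip. The tree, the function names, and the size of the shift are all illustrative assumptions.

```python
import math
import random

# Hypothetical 5-species ultrametric tree (total depth 2), written as the
# Brownian-motion covariance matrix: entry [i][j] is the branch length
# shared by species i and j. The last species is the "target".
C = [
    [2.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 2.0, 1.5],
    [0.0, 0.0, 0.5, 1.5, 2.0],
]

def cholesky(a):
    """Lower-triangular L with L L^T = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def exceptional_rate(shift, reps=4000, seed=1):
    """Fraction of simulated datasets in which the target tip falls outside
    its 95% prediction interval. `shift` is extra change added on the
    target's branch (0 reproduces the null model)."""
    rng = random.Random(seed)
    n = len(C)
    low = cholesky(C)
    obs_cov = [row[:n - 1] for row in C[:n - 1]]   # covariance among non-target tips
    cross = C[n - 1][:n - 1]                       # covariance of target with the rest
    w = solve(obs_cov, cross)                      # weights for the conditional mean
    sd = math.sqrt(C[n - 1][n - 1] - sum(wi * ci for wi, ci in zip(w, cross)))
    hits = 0
    for _ in range(reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated tip values under Brownian motion: y = L z
        y = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        y[n - 1] += shift
        pred = sum(wi * yi for wi, yi in zip(w, y[:n - 1]))
        if abs(y[n - 1] - pred) > 1.96 * sd:
            hits += 1
    return hits / reps

type1 = exceptional_rate(shift=0.0)   # same model on every branch
power = exceptional_rate(shift=3.0)   # directional change on the target branch
print(f"Type I error rate: {type1:.3f}; power at shift 3: {power:.3f}")
```

With no shift, the rejection fraction estimates the Type I error rate and should sit near the nominal 5%. With a shift, it estimates power, and one minus that value is the Type II error rate. A fuller assessment would replace the toy tree with the phylogeny of interest and would estimate the regression coefficients, λ, and the rate parameter rather than treating them as known.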
As noted throughout, one can also investigate exceptional evolution in terms of variable rates of evolution: We expect elevated rates of evolution in the target trait on the lineage leading to the target species. In a previous application of this general approach to study human feeding time, for example, Organ et al. (2011) showed that the rate of evolution was substantially elevated in the lineage leading to humans. Treating branch length as a measure of evolutionary rate, the branch leading to humans would need to be 50 times longer to accommodate the large reduction in molar size in early Homo (under a Brownian motion model and based on changes in body mass). In another example using evolutionary rate variation, Nunn (2011) applied a method (McPeek 1995) based on independent contrasts to study the IMI. As expected, this analysis revealed that change on the branch leading to modern humans is significantly elevated, as compared to other contrasts among the apes. Thus, in addition to computer code provided here for phylogenetic prediction, it would be valuable to develop user-friendly code to implement a wide range of methods, such as McPeek’s (1995) method, or to study variable rates of evolution in the context of singularities using existing code, such as Brownie (O’Meara et al. 2006) and related code in the phytools package in R (Revell 2011).
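The link between branch length and evolutionary rate under Brownian motion can be made explicit with a short worked relation. The symbols below (Δ for the change along a branch, t for its length, σ² for the rate, and k for a stretching factor) are notation introduced here for illustration, and the sketch does not reproduce how the factor of 50 was obtained in the cited study.

```latex
\begin{aligned}
\Delta &\sim \mathcal{N}\!\left(0,\; \sigma^2 t\right),
\qquad z = \frac{\Delta}{\sqrt{\sigma^2 t}},\\[4pt]
t \to k\,t:\quad z_k &= \frac{\Delta}{\sqrt{\sigma^2 k\,t}} = \frac{z}{\sqrt{k}},
\qquad k = 50 \;\Rightarrow\; \sqrt{k} \approx 7.07 .
\end{aligned}
```

Stretching a branch by a factor k is therefore equivalent to keeping its length and multiplying the rate on that branch by k, and it shrinks the standardized change by a factor of the square root of k, roughly sevenfold when k = 50.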
We thank Luke Matthews, Tirthankar Dasgupta, László Zsolt Garamszegi, and two anonymous referees for helpful discussion and feedback. Joel Bray helped format the manuscript. This research was supported by the NSF (BCS-0923791 and BCS-1355902).
- Allman JM, Martin B (2000) Evolving brains. Scientific American Library, New York
- Cooper N, Kamilar JM, Nunn CL (2012) Longevity and parasite species richness in mammals. PLoS One
- Garland T, Midford PE, Ives AR (1999) An introduction to phylogenetically based statistical methods, with a new method for confidence intervals on ancestral values. Am Zool 39:374–388
- Gelman A (2004) Bayesian Data Analysis. Chapman & Hall/CRC, London/Boca Raton
- Harvey PH, Pagel MD (1991) The comparative method in evolutionary biology. Oxford Series in Ecology and Evolution. Oxford University Press, Oxford
- Kappeler PM, Silk JB (eds) (2009) Mind the gap: tracing the origins of human universals. Springer, Berlin
- Lieberman D (2011) The evolution of the human head. Belknap Press, Cambridge
- Liu J (2003) Monte Carlo strategies in scientific computing. Springer, Berlin
- Martin RD (1990) Primate origins and evolution. Chapman and Hall, London
- Napier JR (1970) The roots of mankind. Smithsonian Institution Press, Washington
- Orme D, Freckleton R, Thomas G, Petzoldt T, Fritz S, Isaac N (2011) Caper: comparative analyses of phylogenetics and evolution in R. http://R-Forge.R-project.org/projects/caper/
- Pagel M, Meade A (2007) Bayes traits (http://www.evolution.rdg.ac.uk). 1.0 edn., Reading, UK
- Pagel MD (1994) The adaptationist wager. In: Eggleton P, Vane-Wright RI (eds) Phylogenetics and Ecology. Academic, London, pp 29–51
- Revell LJ (2011) Phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol Evol
- Tooby J, DeVore I (1987) The reconstruction of hominid behavioral evolution through strategic modeling. In: Kinzey WG (ed) The evolution of human behavior: primate models. State University of New York Press, Albany, pp 183–237
- Wrangham RW (2009) Catching fire: how cooking made us human. Basic Books, New York