Human Nature, Volume 23, Issue 3, pp 283–305

Cultural Macroevolution on Neighbor Graphs

Vertical and Horizontal Transmission among Western North American Indian Societies
  • Mary C. Towner
  • Mark N. Grote
  • Jay Venti
  • Monique Borgerhoff Mulder

DOI: 10.1007/s12110-012-9142-z

Cite this article as:
Towner, M.C., Grote, M.N., Venti, J. et al. Hum Nat (2012) 23: 283. doi:10.1007/s12110-012-9142-z


What are the driving forces of cultural macroevolution, the evolution of cultural traits that characterize societies or populations? This question has engaged anthropologists for more than a century, with little consensus regarding the answer. We develop and fit autologistic models, built upon both spatial and linguistic neighbor graphs, for 44 cultural traits of 172 societies in the Western North American Indian (WNAI) database. For each trait, we compare models including or excluding one or both neighbor graphs, and for the majority of traits we find strong evidence in favor of a model which uses both spatial and linguistic neighbors to predict a trait’s distribution. Our results run counter to the assertion that cultural trait distributions can be explained largely by the transmission of traits from parent to daughter populations and are thus best analyzed with phylogenies. In contrast, we show that vertical and horizontal transmission pathways can be incorporated in a single model, that both transmission modes may indeed operate on the same trait, and that for most traits in the WNAI database, accounting for only one mode of transmission would result in a loss of information.


Keywords: American Indians; Cultural evolution; Cultural transmission; Cultural traits; Cross-cultural variation; Autologistic models; Neighbor graphs; Model comparison

More than a century ago, in a paper read at the AAAS meetings, Franz Boas detailed what he called “the most difficult problem of anthropology”: namely, explaining the origins of similarities in culture found across human societies (Boas 1896). Understanding such similarities was and is key to comprehending how human culture evolves. Boas stressed that similarities such as the use of totems, the wearing of ceremonial masks, and even the structure of families could result from several processes: shared ancestry, diffusion or borrowing, and independent innovation. Alfred Kroeber, a student of Boas, reasoned that whereas biological evolution could be represented by an ever-branching tree, cultural evolution, through the process of diffusion in particular, was better represented by a tree with intertwined branches (Kroeber 1948).

Even now, lively debate continues over the ramifications such entanglements have for cultural macroevolution, the mechanisms whereby cultural traits characterize societies and populations across space and over time (Borgerhoff Mulder et al. 2006; Boyd et al. 1997; Collard et al. 2006; Durham 1992; Hewlett et al. 2002; Holden and Shennan 2005; Jordan and Shennan 2003; Moore 1994; Rogers and Ehrlich 2008). Evolutionary anthropologists typically invoke two models of cultural macroevolution: phylogenesis and ethnogenesis (Collard and Shennan 2000). Phylogenesis assumes that cultural traits are inherited vertically at a societal level, with trait variants passing from ancestral to descendant populations. Ethnogenesis assumes a predominance of horizontal transmission, such that cultural traits are strongly influenced by the borrowing and blending of traits between populations, with societies or populations borrowing through copying, teaching, imitation, or imposition (Durham 1990, 1992). As first noted by Kroeber, in place of a clear branching pattern this process produces a reticulate pattern of cultural inheritance among different societies.

Many anthropologists and archaeologists subscribing to ethnogenesis are concerned that the signature of historical origins within cultural assemblages is swamped by rapid rates of horizontal transmission. This concern is in some cases unwarranted. Recent work demonstrates that craft styles and technology are quite faithfully transmitted across generations, as in the case of weaving (where the principal weavers, the women, rarely meet weavers from other ethnic groups because of endogamous marriage customs; Tehrani and Collard 2009); furthermore, population histories (as measured by language) and cultural phylogenies are often quite closely correlated (Gray and Jordan 2000; Holden 2002; Jordan and O’Neill 2010), as reviewed by Mace and Holden (2005). The fit, however, is rarely perfect, and sometimes it is quite poor. For example, there is clear evidence of blending and borrowing of cultural traits among indigenous communities in North America (Jordan and Mace 2006; Jordan and Shennan 2003). Similarly, the intricate designs on Lapita pottery indicate considerable borrowing across Pacific islands (Cochrane and Lipo 2010), and certain features of Iranian carpets show strong incongruence with the written and oral traditions of the tribes’ origins (Tehrani and Collard 2009) and indeed a different phylogenetic history (Matthews et al. 2011).

Clearly, both phylogenesis and ethnogenesis contribute to cultural macroevolution (e.g., Borgerhoff Mulder et al. 2006; Gray and Jordan 2000; Mesoudi and O’Brien 2009). When, where, and to what extent traits are transmitted horizontally or vertically will depend on a variety of features of the trait, including its functional and symbolic relationship with other traits in the population (Boyd et al. 1997), and the extent to which the trait depends on coordination (Fortunato and Jordan 2010; Jordan and O’Neill 2010) or other transmission-coupling mechanisms such as conformism (Boyd and Richerson 2010). There are other considerations: highly functional traits have long been expected to show particular tendencies to be adopted, often rapidly, by unrelated populations (Dunnell 1978), as indeed supported in Gray et al.’s (2010) study of the functional design of canoes in the Pacific. That said, the prediction that phylogenetic signature in language should be stronger in a recently settled archipelago, such as in the Pacific, than in an ancient set of interconnected populations, such as in Eurasia, turns out to be false: the Polynesian languages show a less treelike evolutionary pattern than do the Indo-European languages (Gray et al. 2010). Even language, a cultural trait commonly held to characterize the history of populations, consists of distinct traits (basic lexicon, typological structures, etc.) that show very different histories, only some of which are relatively treelike (Gray et al. 2010).

In fact, there is still no clear guiding theory regarding the prevalence of vertical and horizontal transmission at cultural macroevolutionary scales. Attention has focused more on the methodological challenge of detecting horizontal transmission in cross-cultural datasets (Gray et al. 2008; Shennan 2009; Steele et al. 2010), a challenge increasingly faced in biology where horizontal gene transfer is proving to be extremely common (Dagan and Martin 2007). Such initiatives include methods that quantify the degree to which traits are consistent with a branching-tree model (Collard et al. 2006) and that evaluate the likelihood of any given tree (e.g., Huelsenbeck et al. 2000); recent work in this vein uses Bayesian tests of models in which traits show either different levels of hierarchical aggregation (Matthews et al. 2011) or different associations with geographic, historical (Freckleton and Jetz 2009), and ecological variables (Beheim and Bell 2011). A parallel statistical approach for detecting horizontal transmission uses simulations to identify traits that appear to deviate from a null hypothesis of pure vertical transmission (Franz and Nunn 2009). Drawing inspiration from studies of host-parasite coevolution, some investigators are exploring “co-phylogenetic” methods (Page and Charleston 1998) to identify incongruence among gene trees (Gray et al. 2008; Tehrani et al. 2010), whereas others are drawing from phylogenetic methods developed to take into account horizontal transfer of genes during microbial evolution (e.g., Nelson-Sathi et al. 2010).

Methods such as NeighborNet (Bryant and Moulton 2002) have also been developed that do not assume a tree but instead construct networks. Using geographical proximity as a proxy for horizontal transmission, this approach constructs split decomposition graphs for cultural datasets to detect conflicting signals consistent with horizontal transmission (e.g., Bryant et al. 2005). For instance, Lipo (2006) graphs projectile points from the southeastern United States to show that in the earlier period they can be ordered into a chronological sequence, whereas later on divergences appear. There are many other non-phylogenetic approaches. Archaeologists and others have explored the correlation between geographic and cultural distances among societies (Jordan and Shennan 2003; Moore and Romney 1994; Welsch et al. 1992; Smouse and Long 1992). Another approach is to use distance matrices, effectively asking whether the distance between populations in their cultural traits is predictable in terms of their distance in genes, language, or geographic location (Guglielmino et al. 1995; Hewlett et al. 2002). Regression approaches are demonstrated in a series of papers by Dow and collaborators (Dow 1984, 2007); Dow and Eff (2008) calculate network autocorrelation for more than 1,150 variables coded for the Standard Cross-Cultural Sample with respect to five distinct measures of network proximity: distance, language, cultural complexity, religion, and ecological niche.

Clearly the possibility of horizontal transfer of cultural ideas and practices does not pose an intractable problem for the study of cultural macroevolution, and many potential approaches are on the table for use in archaeological and ethnographic exploration. Here we present a new methodological approach that will help anthropologists to answer the old and enduring anthropological question of the relative importance of horizontal and vertical transmission between populations. The primary merits of our approach lie in placing the analysis of spatial and linguistic signals in the patterning of cultural traits within a single framework (cf. Dow 2007 and Freckleton and Jetz 2009; see also Holden and Mace 1999), and in our use of information criteria to compare models (Matthews et al. 2011). Our approach also lends itself to graphical displays which facilitate trait-to-trait comparisons (see below).

To identify and depict the relative roles of vertical and horizontal transmission, we first embed cultural traits in neighbor graphs (Besag 1975) that capture both shared history and shared geography. We then estimate autologistic models (Besag 1974) that are able to incorporate both history and geography within a single framework (Table 1). Finally we use information criteria (Burnham and Anderson 2002) to evaluate the evidence for alternative models. We do this for a sample of cultural traits from the Western North American Indian (WNAI) database (Jorgensen 1980) in order to investigate the relative importance of shared ancestry and diffusion in shaping the current distribution of traits. Because there is usually scant direct evidence for past transmission events, we follow others in using linguistic and spatial neighbor graphs to model, respectively, shared ancestry and the potential for diffusion among societies (see “Discussion” for interpretational issues).
Table 1
Alternative models

Source of contact
No neighbor graphs
Linguistic graph only
Spatial graph only
Linguistic and spatial graphs




We also evaluate a series of predictions about the relative influence of transmission modes in different cultural domains (Moylan et al. 2006). Anthropologists have long argued that family and kinship are particularly conservative cultural domains, perhaps owing to the microevolutionary transmission of these traits from parents to offspring (Burton et al. 1996; Guglielmino et al. 1995; Jones 2003). Hence our domains “Marriage and Residence” and “Kinship and Family” are predicted to show high levels of vertical transmission. (Domains, traits, and trait descriptions are given in the Electronic Supplementary Material [ESM].) A related set of expectations lies in the intuitions of anthropologists who posit that cultural traits deeply embedded in social institutions are particularly conservative. Typical of these traits are core traditions such as social organization and hierarchy (Giddens 1984; Vansina 1990). Therefore our domain “Political Organization and Social Stratification” is expected to show vertical transmission. In addition, functionally neutral traits, and traits that are shielded from ecological influences, such as those in the domain “Rituals, Beliefs, and Attitudes,” are also predicted to show strong vertical transmission insofar as they are carried unchanged by ancestral populations across varying environments (Dunnell 1978; Neiman 1995; Rogers and Ehrlich 2008). In contrast, traits in the domains “Material Culture” and “Subsistence and Settlement” are more likely to show horizontal transmission, insofar as neighbors may borrow traits that are well adapted to shared ecological conditions (Borgerhoff Mulder et al. 2006).


The WNAI database (Jorgensen 1980) is a collection of cultural trait data from 172 societies in Western North America, spanning southern Alaska, western Canada, the western continental United States, and northern Mexico. This region is incredibly diverse in terms of both local environments and the peoples occupying them. From a vast number of environmental variables, Jorgensen (1980) identifies six environmental regions within the larger Western North American region (see ESM Figure 1). These environmental regions, along with a sample society from each, are the Northwest Coast (e.g., Tsimshian), Northern and Central California (e.g., Central Miwok), Southern California (e.g., Western Diegueño), the Plateau (e.g., Nez Perce), the Great Basin (e.g., Uintah Ute), and the Southwest (e.g., Hopi). These regions vary not only according to seasonal temperatures, rainfall, flora, and fauna, but also according to prominent geographical features. For example, the Pacific coastline and its many bays and tributaries exert heavy influence in several coastal regions, while further inland prominent geographical features include mountain ranges (e.g., Sierra Nevada), deserts (e.g., Mojave), and canyons (e.g., Grand Canyon).

As further described by Jorgensen (1980 and references therein), Western North America at least historically was characterized by an abundance of natural resources which supported high population densities, notably along the Pacific Coast. Here too is where language diversity was at its highest (Campbell 1997; Mithun 1999; Golla 2000). A combination of genetic, linguistic, and archaeological evidence suggests human populations have been occupying the regions for at least 12,000 years, with different waves of settling populations sometimes expanding within or replacing previous groups (see Powell 2005 for recent overview). Many of the societies in the sample had frequent contact with other societies, whether through trade, marriage, or raiding, and in some societies bilingualism was common. The primary mode of subsistence of all societies in the sample was extractive, through varied emphasis on hunting, fishing, and collecting. The more particular forms of subsistence of course varied with the dominant fauna (e.g., salmon and marine mammals in the Northwest vs. small mammals in the Southwest) and flora (e.g., conifer forests in the Northwest vs. oak forests in Southern California). Only in some societies in the Southwest region was farming significant, and even here extracted resources also contributed substantially to the diet.

The WNAI database itself was compiled by Joseph Jorgensen, Harold Driver, and others who drew from traditional ethnographies and the Cultural Element Distribution checklists that Alfred Kroeber had collected in the 1930s (Jorgensen 1980, 1999a). Kroeber’s instructions for the researchers completing the checklists were to interview the elders in the community and to ask about their grandparents’ generation. In compiling the WNAI database from these sources, Jorgensen et al. focused on societies for which there was sufficient ethnographic information available and that were considered distinct from others in the sample (Jorgensen 1999b). The WNAI database includes more than 400 variables, from which we selected 44 cultural traits, focusing on those that could be meaningfully converted into binary variables, that were not highly skewed, and that had few missing cases. Traits are already coded categorically in the WNAI; we combined categories as needed to create appropriate binary traits (see ESM Table 2 for the list of traits with trait descriptions and ESM Table 3 for a list of populations by language group). We sorted our chosen traits into six domains, which represent broader categories of trait types (e.g., material culture; marriage and residence). These are patterned after Jorgensen (1980) and previous studies (Guglielmino et al. 1995; Hewlett et al. 2002; Moylan et al. 2006) to facilitate comparison.

Modeling Cultural Trait Distributions

We used R: A Language and Environment for Statistical Computing (R Development Core Team 2010) to complete all of the modeling steps described in this section, including constructing neighbor graphs, fitting autologistic models to cultural traits, and using information criteria to compare models.¹

Neighbor Graphs

Autologistic models for binary variables (described below) incorporate local dependencies among observations through the use of neighbor graphs (Besag 1975). By the choice of criteria used to define neighbors, such graphs can be customized readily to a wide variety of study contexts. In the present case, two distinct neighbor graphs are of interest: a spatial neighbor graph, connecting pairs of societies living within a specified geographical distance of each other (Fig. 1a), and a linguistic neighbor graph, connecting pairs of societies sharing the same language group (Fig. 1b). Neighbors are defined as follows.
Fig. 1

Spatial and linguistic neighbor graphs for the trait patrilocal. The distribution of the trait is shown, with clear circles for societies with postnuptial residence forms focused on the husband’s male kin (e.g., patrilocal, avunculocal, and virilocal residence) and solid gray circles for societies without such residence forms (including neolocal, bilocal, and matrilocal societies). We use Rgraphviz (Gentry et al. 2010) to display spatial and linguistic neighbor graphs, selecting the “neato” layout option. (a) Gray lines connect spatial neighbors (societies separated by less than 175 km). The large central cluster contains societies located near San Francisco Bay and the Sacramento River Delta. Moving left from the central cluster, the societies are subsequently from the southern California coast, the Mojave Desert and the southern Great Basin. Moving right from the central cluster, the societies are from coastal and inland Oregon, Washington and British Columbia. (ESM Figure A1 shows the spatial neighbor graph in an alternate display using latitude/longitude coordinates.) (b) Gray lines connect linguistic neighbors (members of the same language group as defined in the text). The four large groups in the center (moving clockwise from the upper left) are Yuman-Cochimi, Numic, Apachean, and Coast Salish

We created language neighbor groups drawing from the relatively conservative classifications of Mithun (1999) and Campbell (1997). Because the majority of our language groups fall at the level of “subfamily” (see ESM), we avoid the more controversial deeper relationships proposed by Greenberg (1987). We base this decision on a recent description of the hierarchical structure of North American languages (Golla 2000) which, owing to differential scholarly effort, varies in the level of detail across groups; the current information is not complete enough for a more treelike organization of the WNAI languages.

Simple relational statements of the form “Societies A and B are in language group G” provide sufficient information to construct the linguistic neighbor graph in Fig. 1b; the geometrical forms appearing in the figure are directly implied by shared group membership. The average number of linguistic neighbors over all societies is 8.7, with a range of 0 to 23. The spatial neighbor graph (Fig. 1a) has approximately the same average number of neighbors (7.9) and range (0 to 24) as the linguistic neighbor graph. We achieved this approximate match by defining societies living within 175 km of each other (determined by longitude and latitude data available in the WNAI database) to be spatial neighbors. When these neighbor definitions are applied to the WNAI sample, 10 societies have no linguistic neighbors and 12 societies have no spatial neighbors; in no case, however, was a society both a linguistic and a spatial isolate. We view the matching of neighbor distributions as a calibration procedure, which approximately equalizes the number of available “lines of transmission” connecting a typical society with its neighbors in the two different graphs. The subject of our investigation—whether or not cultural traits appear to be differentially shared across linguistic and spatial lines of transmission—comes more readily into focus with calibrated neighbor graphs.
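The two neighbor definitions above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' R code: the haversine distance formula, the function names, and the toy coordinates and group labels are all assumptions introduced here, and the source does not say how distances were computed from the latitude and longitude data.

```python
import math
from itertools import combinations

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees.
    Assumption: the source does not state which distance formula was used."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def spatial_edges(coords, threshold_km=175.0):
    """Pairs (i, j), i < j, of societies separated by less than threshold_km."""
    return {(i, j) for i, j in combinations(range(len(coords)), 2)
            if haversine_km(*coords[i], *coords[j]) < threshold_km}

def linguistic_edges(groups):
    """Pairs (i, j), i < j, in the same language group; None marks an isolate."""
    return {(i, j) for i, j in combinations(range(len(groups)), 2)
            if groups[i] is not None and groups[i] == groups[j]}

def mean_degree(edges, n):
    """Average number of neighbors per society (each edge counts for two societies)."""
    return 2 * len(edges) / n

# Hypothetical toy data, not WNAI values: (latitude, longitude) and group labels.
coords = [(38.0, -122.0), (38.5, -121.5), (37.8, -122.4), (47.6, -122.3), (48.0, -123.0)]
groups = ["A", "B", "A", "C", "C"]
spatial = spatial_edges(coords)
linguistic = linguistic_edges(groups)
```

Calibration in the sense described above would amount to adjusting `threshold_km` until `mean_degree(spatial, n)` roughly matches `mean_degree(linguistic, n)`.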

To examine the potential association between linguistic and spatial neighbor forms, we tally pairs of societies that are both spatial and linguistic neighbors, neighbors of only one type, or of neither type (Table 2). The counts sum to 14,706, the number of unique pairs among the 172 societies. The odds ratio for this table is 11.8, suggesting a strong association between the two neighbor types (an odds ratio of 1 would indicate independence of neighbor types). This calculation, however, is highly sensitive to the fact that the majority of pairs are neither spatial nor linguistic neighbors. Calculating from Table 2, the fraction of pairs of societies that are spatial neighbors, given that they are linguistic neighbors, is 0.29; similarly, the fraction that are linguistic neighbors, given that they are spatial neighbors, is 0.32. Although neighbors of one type are relatively likely to be neighbors of the other type, dual relationships are not at all a certainty. We view the spatial and linguistic neighbor graphs as representing distinct, though overlapping, lines of transmission between societies.
Table 2
Linguistic and spatial neighbor pairs in the WNAI database

Columns: Spatial neighbors; Not spatial neighbors
Rows: Linguistic neighbors; Not linguistic neighbors
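As a rough illustration of the tally behind Table 2, the following Python sketch cross-classifies pairs from two edge sets and computes the odds ratio and the two conditional fractions. The function names and the six-society toy graphs are hypothetical and do not reproduce the WNAI counts; only the pair total for 172 societies is taken from the text.

```python
from math import comb

def tabulate_pairs(spatial, linguistic, n):
    """Cross-classify all unique pairs among n societies by neighbor type."""
    both = len(spatial & linguistic)
    ling_only = len(linguistic - spatial)
    spat_only = len(spatial - linguistic)
    neither = comb(n, 2) - both - ling_only - spat_only
    return both, ling_only, spat_only, neither

def odds_ratio(both, ling_only, spat_only, neither):
    """Odds ratio of the 2 x 2 table; a value of 1 indicates independence."""
    return (both * neither) / (ling_only * spat_only)

# Number of unique pairs among the 172 societies, as stated in the text.
n_pairs_wnai = comb(172, 2)

# Hypothetical toy graphs on six societies (edges stored as (i, j) with i < j).
toy_spatial = {(0, 1), (1, 2), (2, 3), (4, 5)}
toy_linguistic = {(0, 1), (0, 2), (1, 2), (3, 4)}
both, ling_only, spat_only, neither = tabulate_pairs(toy_spatial, toy_linguistic, 6)

toy_or = odds_ratio(both, ling_only, spat_only, neither)
frac_spatial_given_ling = both / (both + ling_only)
frac_ling_given_spatial = both / (both + spat_only)
```

Because the "neither" cell enters the numerator, the odds ratio grows with the number of unconnected pairs, which is the sensitivity noted in the text; the conditional fractions do not depend on that cell.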



Autologistic Model

The autologistic model (Besag 1974) assumes that the trait value for each society depends probabilistically on the trait values of the society’s spatial and linguistic neighbors. The strength of this dependence is measured by distinct spatial and linguistic association parameters. Examples of the development and application of autologistic and related models include analyses of plant spatial distributions (Besag 1974), reconstructing pixelated images (Besag 1986), estimating genetic relatedness (Geyer and Thompson 1992), constructing species distribution maps (Hoeting et al. 2000), modeling the formation of tenant-farmer contracts (Young and Burke 2001), and predicting purchasing behavior based on consumer similarity (Moon and Russell 2008). In a simulation study, Dormann (2007) criticizes autologistic modeling that is now commonplace in spatial ecology, but the models in his scope are fitted by a computational shortcut (pseudo-likelihood). We employ a full likelihood method (described below) in the spirit of the original model and an early implementation by Geyer and Thompson (1992). Our version of the autologistic model is the first that we know of to incorporate two distinct neighbor graphs simultaneously.

For a given trait, let \( x = (x_1, x_2, \ldots, x_n) \) be the sample of binary trait values across societies, where for the ith society, \( x_i \) takes the value −1 or +1 according to an arbitrary coding scheme (the direction of which has no bearing on results). Two societies that are spatial neighbors are called concordant for the trait if they have the same trait value (both −1, or both +1); otherwise they are called discordant. Concordant and discordant linguistic neighbors are defined analogously. Let S(x) be the number of unique spatial neighbor pairs concordant for the trait, minus the number discordant for the trait; let T(x) be defined analogously for linguistic neighbor pairs; and let \( U(x) = \sum_i x_i \). Under the autologistic model, the sample likelihood is
$$ L(\theta, \lambda, \beta; x) = \frac{\exp\{\theta S(x) + \lambda T(x) + \beta U(x)\}}{z(\theta, \lambda, \beta)} \qquad (1) $$
where θ, λ, and β are parameters to be estimated, and z(θ, λ, β) is a normalizing constant equal to \( \sum_x \exp\{\theta S(x) + \lambda T(x) + \beta U(x)\} \), the summation being over the set of all \( 2^n \) binary arrays of length n. The distribution implied by equation 1 is an exponential family (Barndorff-Nielsen 1978) having association parameters θ and λ, level parameter β (broadly analogous to the intercept in logistic regression), and sufficient statistics S, T, and U.
Under the model, the log-odds that the ith society has trait value +1, given the trait values of all other societies, is
$$ \text{logit}\, \Pr(x_i = +1 \mid \text{all other societies}) = 2(\beta + \theta g_i + \lambda l_i) \qquad (2) $$
where \( g_i \) is the sum of the trait values of spatial neighbors of society i, and \( l_i \) is the analogous sum for linguistic neighbors. Equation (2) shows that the trait value for society i is, in effect, predicted by a logistic regression on the sums of the trait values of its spatial and linguistic neighbors. Thus global trait patterns are reducible to local trait dependencies between neighbors.
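The relation between the likelihood (eq. 1) and the conditional log-odds (eq. 2) can be checked numerically. The Python sketch below is an illustration with hypothetical function names and toy graphs: because the values are coded −1 or +1, concordant minus discordant pairs equals the sum of products \( x_i x_j \) over edges, and flipping one society's value changes the unnormalized log-likelihood by exactly the amount eq. 2 predicts.

```python
def pair_stat(x, edges):
    """Concordant minus discordant pairs: with +/-1 coding this is the
    sum of x[i] * x[j] over the edges of the neighbor graph."""
    return sum(x[i] * x[j] for i, j in edges)

def log_weight(x, spatial, linguistic, theta, lam, beta):
    """Unnormalized log-likelihood: theta*S(x) + lam*T(x) + beta*U(x)."""
    return (theta * pair_stat(x, spatial)
            + lam * pair_stat(x, linguistic)
            + beta * sum(x))

def neighbor_sum(x, edges, i):
    """Sum of the trait values of society i's neighbors in one graph."""
    return sum(x[b] if a == i else x[a] for a, b in edges if i in (a, b))

def conditional_logit(x, i, spatial, linguistic, theta, lam, beta):
    """Log-odds that x[i] = +1 given all other societies (eq. 2)."""
    g = neighbor_sum(x, spatial, i)
    l = neighbor_sum(x, linguistic, i)
    return 2 * (beta + theta * g + lam * l)

# Hypothetical toy example: four societies, two small graphs, made-up parameters.
spatial = {(0, 1), (1, 2)}
linguistic = {(0, 2), (2, 3)}
theta, lam, beta = 0.3, 0.25, -0.1
x_plus = [1, -1, 1, 1]    # society 2 has the trait
x_minus = [1, -1, -1, 1]  # society 2 lacks the trait, others unchanged

# The normalizing constant cancels in the ratio, so the difference of
# unnormalized log-likelihoods is the conditional log-odds.
brute = (log_weight(x_plus, spatial, linguistic, theta, lam, beta)
         - log_weight(x_minus, spatial, linguistic, theta, lam, beta))
formula = conditional_logit(x_plus, 2, spatial, linguistic, theta, lam, beta)
```

Here `brute` and `formula` agree up to floating-point error, which is the sense in which global trait patterns reduce to local dependencies.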

Motivating the Method with an Example

Figures 1a and b, respectively, show the spatial and linguistic neighbor graphs for the WNAI data, with societies (nodes of the graphs) coded by the example trait patrilocal. Consider the 10 societies of the Interior Salish language group (indicated by the label “IS” in Fig. 1b). All pairs of societies within this group are linguistic neighbors and are connected by edges of the linguistic neighbor graph. Suppose we want to predict the trait value for the society nearest to the label “IS”, as if this society’s trait were unknown. Under the autologistic model, the odds that this society is patrilocal depend on the difference in numbers of patrilocal and non-patrilocal societies among the focal society’s neighbors (eq. 2). In the present example, the focal society has six patrilocal and three non-patrilocal neighbors, giving a difference of three; by eq. 2, the odds that the focal society is patrilocal are then proportional to \( e^{2 \times 3 \times \lambda} = e^{6\lambda} \) (with large, positive values of λ producing greater odds). In order to complete the prediction, a similar calculation using the focal society’s spatial neighbors is incorporated, along with a contribution from the level parameter β (eq. 2).
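The arithmetic of this example can be written out from eq. 2. The short derivation below assumes, for illustration, that patrilocal is coded +1; the focal society's linguistic neighbor sum is \( l_i = 6 - 3 = 3 \), while its spatial neighbor sum \( g_i \) is left symbolic because the text does not report it.

$$ \frac{\Pr(x_i = +1 \mid \text{all other societies})}{\Pr(x_i = -1 \mid \text{all other societies})} = \exp\{2(\beta + \theta g_i + \lambda l_i)\} = e^{2\beta}\, e^{2\theta g_i}\, e^{6\lambda} $$

The linguistic neighbors thus contribute the factor \( e^{6\lambda} \), the spatial neighbors the factor \( e^{2\theta g_i} \), and the level parameter the factor \( e^{2\beta} \).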

Broadly speaking, the autologistic model is fitted to a trait (such as patrilocal) by an algorithm which predicts the trait value for each society in turn, determining the most suitable parameters θ, λ, and β by combining results from many rounds of prediction. We examine four models for each trait: the full model (eq. 1), as well as three sub-models that essentially “turn off” the effect of spatial or linguistic neighbors or both by fixing one or both association parameters at zero (see Table 1). The level parameter β is retained in all four models.

Gibbs Sampler

Although equation (1) is explicit, calculating the normalizing constant z(θ, λ, β) requires the enumeration of \( 2^n \) binary arrays, which is feasible only for small samples. We therefore adopt Markov chain Monte Carlo (MCMC) methods described by Geyer (1991, 1994, 1996) for numerical likelihood inference.

Equation (2) gives the “full conditional” distribution of the trait value for each society given all other societies; thus the Gibbs sampler (Geman and Geman 1984) is a natural choice for generating random realizations of trait values under models 1–4. Stated briefly, our implementation of the Gibbs sampler works as follows. For a given trait under model m = 1, …, 4, we use the conditional distribution (2), with parameters \( \theta_m \), \( \lambda_m \), and \( \beta_m \), to generate a trait value at random for each society, given the trait values of all other societies. Each society in the sample is thus updated in turn, until all societies have been updated; this completes one scan of the Gibbs sampler for model m. Many realizations from the autologistic model with parameters \( \theta_m \), \( \lambda_m \), \( \beta_m \) are generated by repeated scans. The likelihood (eq. 1) is then approximated by importance-sampling calculations (Geyer 1994, 1996), which combine realizations generated under models 1–4 in a weighted fashion. We give implementation details in the ESM.
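A minimal Python sketch of one such sampler follows. It illustrates the scan described above and is not the authors' R implementation: the function names, the random starting configuration, and the toy graphs are assumptions, and the importance-sampling step that approximates the likelihood is not shown.

```python
import math
import random

def neighbor_lists(edges, n):
    """Adjacency lists for an undirected graph given as a set of (i, j) pairs."""
    nbrs = [[] for _ in range(n)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    return nbrs

def gibbs_scan(x, s_nbrs, l_nbrs, theta, lam, beta, rng):
    """One scan: update every society in turn from its full conditional (eq. 2)."""
    for i in range(len(x)):
        g = sum(x[j] for j in s_nbrs[i])  # spatial neighbor sum
        l = sum(x[j] for j in l_nbrs[i])  # linguistic neighbor sum
        logit = 2 * (beta + theta * g + lam * l)
        p_plus = 1 / (1 + math.exp(-logit))
        x[i] = 1 if rng.random() < p_plus else -1
    return x

def run_sampler(n, spatial, linguistic, theta, lam, beta,
                scans, burn_in, thin, seed=0):
    """Discard burn_in scans, run `scans` more, and keep every `thin`-th one."""
    rng = random.Random(seed)
    s_nbrs = neighbor_lists(spatial, n)
    l_nbrs = neighbor_lists(linguistic, n)
    x = [rng.choice((-1, 1)) for _ in range(n)]  # assumed random start
    kept = []
    for t in range(burn_in + scans):
        gibbs_scan(x, s_nbrs, l_nbrs, theta, lam, beta, rng)
        if t >= burn_in and (t - burn_in) % thin == thin - 1:
            kept.append(list(x))
    return kept

# Hypothetical toy run on four societies with made-up parameters.
realizations = run_sampler(4, {(0, 1), (1, 2)}, {(0, 2), (2, 3)},
                           theta=0.3, lam=0.25, beta=-0.1,
                           scans=200, burn_in=50, thin=2)
```

With `scans=50_000`, `burn_in=1_000`, and `thin=2`, the same function would return 25,000 realizations for one model.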

Our results are based on R = 100,000 Gibbs realizations for each trait (25,000 for each model); the realizations are obtained by thinning a total of 200,000 scans (50,000 for each model after discarding a burn-in of 1,000 scans) at an interval of two. We examined time-series graphs of the sufficient statistics S, T, and U over scans, to check that mixing was adequate and investigate convergence of the sampler. We apply the MCMC method independently to each of the 44 traits.

Model Comparison

Generally speaking, a simple comparison of θ and λ for a given trait is inadequate for assessing the relative importance of spatial and linguistic neighbors, as θ and λ must be interpreted in the context of their respective (and possibly very different) neighbor graphs. However, calibration of the neighbor graphs facilitates qualitative comparisons of θ and λ (Fig. 2). When the graphs are calibrated, a typical society is connected to its spatial and linguistic neighbors by similar numbers of edges: thus if the two types of neighbors have about the same influence on a trait, the magnitudes of the statistical signals from the graphs should be similar. Differential influences of spatial and linguistic neighbors should accordingly lead to differences between θ and λ. In order to determine more definitively the importance of spatial and linguistic neighbors for each trait, we compare the relative support for models 1–4 using information criteria (Burnham and Anderson 2002) (Fig. 3). Information criteria allow for the direct comparison of non-nested models (M3 vs. M2; see Table 1) and offer an alternative to conventional approaches focusing on questions of “statistical significance.” In the work to follow, inferences about the effects of spatial and linguistic neighbors will be made by competing models 1–4 against each other, rather than by testing whether θ and λ are individually or jointly different from zero.
Fig. 2

Estimated spatial and linguistic association parameters for 44 traits of the WNAI database. Scatterplot of estimated θ and λ, respectively the spatial and linguistic association parameters of model 4

Fig. 3

Akaike weights for 44 traits of the WNAI database, sorted into six domains. The weights, shown schematically as shaded areas of the horizontal bars, indicate the relative support for each model of the trait, compared to the other three models. The best-supported model among M1–4 has the largest weight. Models 1–4 are briefly described in Table 1. The calculation of model weights using information criteria is described in the Model Comparison section of the text. (Traits and trait descriptions are provided in the ESM, Table 2)

For a given trait under model m, the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) are defined respectively as \( \text{AIC}_m = -2 L^{*}_m + 2 K_m \) and \( \text{BIC}_m = -2 L^{*}_m + K_m \log n \), where \( L^{*}_m \) is the maximized log-likelihood (1) under model m, \( K_m \) is the number of parameters in model m, and n is the sample size (Burnham and Anderson 2002). The model for a trait with the smallest AIC (and, respectively, BIC) is understood to provide the best balance between goodness of fit and model complexity. The method for calculating information criterion differences among models using numerical likelihoods is described in the ESM. We use Akaike weights to compare the relative evidence for each model of a trait (Burnham and Anderson 2002). The Akaike weight for model m is
$$ w_m = \frac{\exp\{-\Delta_m/2\}}{\sum_{l} \exp\{-\Delta_l/2\}} $$
where \( \Delta_m = \text{AIC}_m - \text{AIC}_{\min} \), \( \text{AIC}_{\min} \) is the smallest AIC among models 1–4, and the summation is over l = 1, …, 4. (An analogous calculation is used to find the BIC weights.) The weights, which sum to 1 over models for a given trait, can be used to compare the relative evidence in favor of each model, with more likely models receiving larger weights (Burnham and Anderson 2002).
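The calculation of AIC, BIC, and the corresponding weights can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the log-likelihoods and parameter counts below are hypothetical placeholders, and n = 172 is taken from the number of societies in the database (traits with missing values would use a smaller n).

```python
import math

def aic(loglik, k):
    """AIC from a maximized log-likelihood and a parameter count."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """BIC from a maximized log-likelihood, a parameter count, and sample size n."""
    return -2.0 * loglik + math.log(n) * k

def ic_weights(ic_values):
    """Akaike-style weights: exp(-Delta/2), normalized to sum to 1."""
    best = min(ic_values)
    rel = [math.exp(-(v - best) / 2.0) for v in ic_values]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical values for models 1-4 of a single trait (not from the paper).
logliks = [-110.0, -104.0, -101.0, -98.0]
ks = [1, 2, 2, 3]   # assumed parameter counts for M1..M4
n = 172             # number of societies in the WNAI database

aic_w = ic_weights([aic(l, k) for l, k in zip(logliks, ks)])
bic_w = ic_weights([bic(l, k, n) for l, k in zip(logliks, ks)])
```

Because log n exceeds 2 for any realistic sample, BIC penalizes each extra parameter more heavily than AIC, so in this toy example the full model receives a smaller BIC weight than Akaike weight.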

Validating the Method by Exact Calculation and Simulation

The MCMC method circumvents evaluation of the normalizing constant z (eq. 1) but produces a stochastic approximation to the true likelihood. We wish to learn whether or not the approximate method produces parameter estimates and information measures close to those that would be obtained by exact calculation. Evaluation of the normalizing constant z is feasible in small samples, using an integer-based enumeration technique (see ESM). We therefore compared MCMC parameter estimates and model weights with analogous results obtained by exact evaluation of the likelihood, in a subsample of societies and traits. We found that MCMC reproduced exact results to an accuracy of \( 10^{-2} \), the level of precision we employ throughout the study. Details of the exact study are in the ESM.
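To illustrate why exact evaluation is feasible only in small samples, the following sketch enumerates every binary configuration of a small set of societies. It rests on assumptions not stated in this section: traits are coded 0/1, and the unnormalized log-probability has the standard autologistic form β Σ y_i + θ Σ y_i y_j (over spatial edges) + λ Σ y_i y_j (over linguistic edges). The authors' own enumeration technique is described in the ESM and may differ in detail.

```python
import math

def exact_log_z(n, spatial_edges, linguistic_edges, beta, theta, lam):
    """Log normalizing constant by brute-force enumeration of all 2**n states.

    Assumes a 0/1-coded autologistic model; edges are (i, j) index pairs.
    """
    energies = []
    for state in range(2 ** n):
        # Each integer encodes one configuration: bit i is the trait of society i.
        y = [(state >> i) & 1 for i in range(n)]
        energy = beta * sum(y)
        energy += theta * sum(y[i] * y[j] for i, j in spatial_edges)
        energy += lam * sum(y[i] * y[j] for i, j in linguistic_edges)
        energies.append(energy)
    # Log-sum-exp for numerical stability.
    top = max(energies)
    return top + math.log(sum(math.exp(e - top) for e in energies))

# Hypothetical four-society example.
log_z = exact_log_z(4, [(0, 1), (1, 2), (2, 3)], [(0, 3)],
                    beta=-0.5, theta=0.8, lam=0.4)
```

The loop visits 2^n configurations, so the cost doubles with every added society; this is what makes the MCMC approximation necessary for the full sample.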

We designed a simulation study to learn whether or not autologistic models respond systematically to variation in the level of horizontal transmission between societies (a “proof of principle”). We generated 200 datasets using the spatio-temporal simulation program of Nunn et al. (2006), with horizontal transmission rates varying between 0.0 and 0.1. The simulated datasets contained 100 societies, each society having a binary trait value, a position on a two-dimensional lattice, and a known phylogenetic lineage. In order to produce datasets conforming to the WNAI sample, we converted the simulated phylogenetic trees to phylogenetic neighbor graphs, calibrating these with the spatial neighbor graphs obtained directly from the lattice (see ESM). We developed a semi-automated batch computing routine to fit autologistic models (using methods described above) to the simulated datasets. We then compared the relative support for models 1–4 for each simulated dataset. We found (see ESM) that autologistic parameter estimates and model weights respond in a predictable way to variation in horizontal transmission rates: on average, model 2 (phylogenetic neighbors only) is favored when horizontal transmission is rare; model 3 (spatial neighbors only) is favored when horizontal transmission is frequent. For intermediate levels of horizontal transmission, models 2, 3, and 4 (the last: both phylogenetic and spatial neighbors) may all be supported by the data.

Adequacy of Binary Neighbor Graphs

A methodological question raised by our approach is whether binary neighbor graphs can adequately capture associations among societies. A related question is whether graphs with qualitatively different topologies can be used fairly in the same model (in the WNAI sample, the linguistic neighbor graph is formed of exclusive cliques determined by subfamily membership, while the spatial neighbor graph is relatively more connected; see Figs. 1a, b). Theoretical results for the Ising model of statistical physics (from which the autologistic model is derived) show that for well-connected graphs, rich association structures, including long-distance correlations between societies, can be generated readily by binary neighbor relationships (Pickard 1987; Ripley 1988). The answer to the question of adequacy is then a rather definite “yes,” for the WNAI spatial neighbor graph and others like it. The comparative cliquishness of the linguistic neighbor graph results from limited information about language history in the region, and from our use of the uncontroversial but conservative classifications of Mithun (1999) and Campbell (1997). Additional information connecting language groups to one another (provisional on agreement among linguists) could be incorporated into the graph, perhaps using second- and higher-order neighbor relationships (Besag 1974). Neighbor graphs are flexible enough to be adapted to the information a researcher has at hand. Finally, our simulation study (ESM) involved phylogenetic neighbor graphs composed of many small cliques, in contrast to highly connected spatial neighbor graphs. The contrasting topologies were no impediment to recovering differential signals of vertical and horizontal transmission in simulated datasets.
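One possible reading of "second-order neighbor relationships" on a general graph is the set of societies exactly two edges away. The sketch below computes that set from a first-order adjacency map. This interpretation is an assumption made for illustration (Besag's original scheme is defined on lattices), and the society labels are hypothetical.

```python
def second_order_neighbors(adjacency):
    """Map each node to the nodes exactly two steps away.

    `adjacency` maps a node to the set of its first-order neighbors.
    First-order neighbors and the node itself are excluded from the result.
    """
    result = {}
    for node, first in adjacency.items():
        two_step = set()
        for nb in first:
            two_step |= adjacency.get(nb, set())
        result[node] = two_step - first - {node}
    return result

# Hypothetical first-order graph: a chain A - B - C plus an isolated society D.
first_order = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
second_order = second_order_neighbors(first_order)
```

A second-order graph built this way could enter the model as an additional neighbor graph with its own association parameter, at the cost of one more parameter per added graph.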


Spatial and Linguistic Effects

The largest estimates of spatial effects (θ) occur for traits falling in the Material Culture domain, including whether salt is added to food (salt), the method of drying meat (drymeat), whether stone food mortars are used (mortar), and some methods of house coverings (barkmat and stoneearth, but not hidethatch) (Fig. 2). The other such trait concerns whether local agricultural products contributed to the diet (agrodiet), a trait limited to a small area in the Southwest. The largest estimates of linguistic effects (λ) are also in some cases found for Material Culture traits, specifically the form of digging sticks, whether milling stones were used, and whether hide or thatch house coverings were used. Population density (popdens) and ownership of houses by descent units (ownhouse) also had large estimates of λ. Even for traits with high spatial or linguistic effects, a full model that includes both is generally favored (see next section).

Using Information Criteria to Compare Models across Traits

The AIC and BIC weights (see “Methods”) associated with models 1–4 for 44 traits of the WNAI database are shown in Fig. 3. M4, which predicts traits using both linguistic and spatial neighbors, receives a very high Akaike weight (0.95 or higher) for 28 of 44 traits and has the highest Akaike weight among the four models for 40 of 44 traits. BIC weights follow a similar pattern, though more weight is often given to the second-best (and, in all cases, simpler) model. For instance, looking at agriculture (agrodiet), the Akaike weights for M4 and M3 are 0.90 and 0.10, respectively, while the BIC weights are 0.65 and 0.35. Nevertheless, M4 still receives the highest BIC weight in the majority of cases (37 of 44 traits). For some traits, a weight for M3 (spatial neighbors only) or M2 (linguistic neighbors only) is greater than 0.05, suggesting that a model incorporating only one of the two neighbor graphs may be entertained (Fig. 3). However, neither M3 nor M2 is the obvious second-best model overall, considering AIC and BIC weights.
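The agrodiet figures offer a quick consistency check on how AIC and BIC weights relate. The sketch below converts Akaike weights to BIC weights using the identity BIC − AIC = (log n − 2) K. It assumes that M4 has exactly one more parameter than M3 (here 3 versus 2), that n = 172, and that M1 and M2 carry negligible weight for this trait; none of these is stated in this section, and the reported weights are rounded, so agreement can only be approximate.

```python
import math

def bic_weights_from_akaike(akaike_weights, ks, n):
    """Convert Akaike weights to BIC weights for the same set of models.

    Uses Delta_AIC = -2 ln(w) (up to a constant shared by all models) and
    BIC_m - AIC_m = (ln n - 2) * K_m.
    """
    scores = [-2.0 * math.log(w) + (math.log(n) - 2.0) * k
              for w, k in zip(akaike_weights, ks)]
    best = min(scores)
    rel = [math.exp(-(s - best) / 2.0) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]

# Reported Akaike weights for agrodiet: M4 = 0.90, M3 = 0.10.
# Parameter counts (3 and 2) and n = 172 are assumptions for illustration.
w_m4, w_m3 = bic_weights_from_akaike([0.90, 0.10], ks=[3, 2], n=172)
```

Under these assumptions the converted weights come out close to the reported BIC weights of 0.65 and 0.35, which illustrates how the heavier BIC penalty shifts support toward the simpler model.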

To examine the fit of the models to the data, we used the maximum likelihood parameter estimates for θ, λ, and β for each trait to predict the trait value for each society in turn, given the known values of the society’s spatial and linguistic neighbors (see ESM). Across the 44 traits, the full model (M4) predicts an average of 83.3% of the societies’ values correctly (range 66.0–95.9%). This is a marked increase over the percentage correct if one were simply to guess the more frequent trait value for each society (average 61.2%, range 51.3–78.5%). On a per-society basis across the 44 traits, we can identify the societies for which M4 had the least predictive success. These societies tend to fall into two groups: those that did not have spatial or linguistic neighbors, and a portion of those societies found in the dense cluster along the northern California and southern Oregon coast.
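The prediction check described above can be sketched as follows. The sketch assumes a 0/1-coded autologistic model in which the log-odds of a society having the trait equal β plus θ times the number of spatial neighbors with the trait plus λ times the number of linguistic neighbors with the trait, and it predicts the trait as present when the conditional probability is at least 0.5. Both the coding and the threshold are assumptions; the authors' procedure is given in the ESM. All data and parameter values are hypothetical.

```python
import math

def conditional_prob(i, y, spatial, linguistic, beta, theta, lam):
    """P(y_i = 1 | neighbors) under an assumed 0/1-coded autologistic model."""
    eta = beta
    eta += theta * sum(y[j] for j in spatial.get(i, ()))
    eta += lam * sum(y[j] for j in linguistic.get(i, ()))
    return 1.0 / (1.0 + math.exp(-eta))

def prediction_accuracy(y, spatial, linguistic, beta, theta, lam):
    """Share of societies whose trait is predicted correctly from their neighbors."""
    hits = 0
    for i in y:
        p = conditional_prob(i, y, spatial, linguistic, beta, theta, lam)
        predicted = 1 if p >= 0.5 else 0
        hits += int(predicted == y[i])
    return hits / len(y)

def majority_baseline(y):
    """Accuracy of always guessing the more frequent trait value."""
    ones = sum(y.values())
    return max(ones, len(y) - ones) / len(y)

# Hypothetical toy data: five societies, trait present in three.
y = {"a": 1, "b": 1, "c": 1, "d": 0, "e": 0}
spatial = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": ["e"], "e": ["d"]}
linguistic = {"a": ["c"], "c": ["a"]}

acc = prediction_accuracy(y, spatial, linguistic, beta=-1.0, theta=1.5, lam=1.0)
base = majority_baseline(y)
```

Societies with no neighbors in either graph are predicted from β alone, which is consistent with the observation that such societies are among those the model predicts least well.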

Looking across Domains

Given that the results strongly favor M4 for almost all traits, any domain differences that do exist would appear to be subtle. In the two cases where M3 has the highest Akaike weight, both traits—stone food mortars (mortar) and stone or earth house covers (stoneearth)—fall under the “Material Culture” domain (BIC weights follow the same pattern). In the cases where M2 has the highest Akaike weight, one trait—bride price (brideprice)—falls under “Marriage and Residence,” and the other trait—possessional shamanism (shamantrance)—under “Rituals, Beliefs, and Attitudes.” Along with these traits, BIC weights favor M2 for an additional trait in each of the domains “Subsistence and Settlement” (huntdiet) and “Kinship and Family” (linealfamily).


Our novel use of the autologistic model allows us to demonstrate that most cultural traits in the WNAI are best predicted by the combined effects of both spatial and linguistic neighbors, and that neither spatial nor linguistic neighbors are inherently superior predictors of traits. Our results run counter to the claim that cultural trait distributions can be explained largely by the transmission of traits from parent to daughter populations (Pagel and Mace 2004) and are thus best analyzed with phylogenies. They also question the assertion that “many of the traits of most interest to anthropologists involve codified practices and ancient rituals with tighter intergenerational constraints that are likely to limit the impact of horizontal transfer” (Gray et al. 2008:369). Rather, they are consistent with more recent indications that even lexical traits, long thought to be largely treelike, show high levels of borrowing (e.g., Nelson-Sathi et al. 2010 infer that 61% of cognates have been affected by borrowing). In this study we show that for most traits in the WNAI database, accounting for only one mode of transmission would result in a loss of information. This does not mean that vertical transmission is not a critically important force of cultural macroevolution in many situations, most obviously where there have been recent and rapid human population expansions (Gray and Jordan 2000; Holden 2002). Rather it suggests we need greater methodological pluralism.

Our results also suggest that both modes of transmission are important regardless of the cultural domain. In other words, we find no supportive evidence for domain-specific transmission modes. How can we reconcile this with the oft-cited claims for such patterns? One answer is that previous studies do in fact show evidence for all modes of transmission, but that the results have been misinterpreted, in part owing to flawed inferences based on p-values (see Towner and Luttbeg 2007). For example, in the “Family and Kinship” domain, Guglielmino et al. (1995) find statistically significant (at a traditional p < 0.05 level) correlations with linguistic affiliation in 10 of 12 traits, and they find geographic clustering values that overlap in range with each of their other domains. Although the authors originally showed caution in their interpretations, their results have been reported by others (Pagel and Mace 2004), and indeed themselves (Hewlett et al. 2002), as showing strong evidence for vertical transmission of “Family and Kinship” traits.

The importance of both vertical and horizontal processes as well as the absence of domain-specific patterns reported here may be common in particular regions of the world, reflecting ecological heterogeneity as well as waves of human settlement (see “Materials” for a description of the WNAI region). Our study’s findings are consistent with other cross-cultural analyses of trait variation in this geographical region, including the substantive work of Jorgensen (1980). In his efforts to use trait correlation matrices to identify unique culture areas within the WNAI, Jorgensen notes that “no two topics created identical culture area taxonomies, and that the correlations among topics varied widely” (1980:96). Such assertions are borne out by more recent studies by Jordan and his collaborators focusing on specific regions and traits within western North America (Jordan and Mace 2006, 2008; Jordan and O’Neill 2010; Jordan and Shennan 2003). Distinct patterns of trait transmission, even within a domain such as “Material Culture,” appear predicated on the ways in which traits are linked to notions of territoriality and identity (e.g., potlatch ceremonies), as well as how traits are produced. For example, in the Pacific Northwest, vertical transmission characterized tasks that required specialist individual knowledge and tools (e.g., canoe-building) or substantial coordination of close relatives (e.g., building plank houses), while portable crafts (e.g., basket production), particularly those carried out by women, may have flowed more readily along patterns of postmarital residence and trade routes.

The evidence above joins that from a number of other studies in suggesting that the phylogenetic signature found in cultural data is highly variable (Collard et al. 2006; Gray et al. 2008; Moylan et al. 2006) and can be quite weak—for example, in studies of brasswind cornets (Tëmkin and Eldridge 2007), Christian sects (Venti 2004), particular features of Iranian carpets (Tehrani and Collard 2009), and some parts of the Germanic language tree (Bryant et al. 2005). Clearly more empirical work is needed across a much wider array of traits to determine the extent of horizontal transmission in cultural traits, addressing linked questions of whether borrowing is constant over time or concentrated in temporal pulses, whether it affects some domains more than others, and how it varies across continents and historical eras. Here, like Gray et al. (2008), we call for more empirical research, encouraging investigators to use approaches, such as the one demonstrated here, that evaluate multiple influences on cultural trait patterning within a single explicit model.

Methodological Issues

Turning to methodological issues, our avoidance of phylogenetic methods in this analysis does not reflect an anti-evolutionary stance on cultural macroevolution. Such methods play a valuable role in identifying population homelands, dating population divergences, inferring the cultural traits of ancestral populations, and assessing rates of cultural change (Gray et al. 2008). Where true trees can be determined from independent data, as with the history of brass instruments over the past two centuries, horizontal transfer events can be accurately calculated on a yearly basis (Tëmkin and Eldridge 2007). That said, elaborations on phylogenetic methods are not necessarily the best choice for trying to detect non-treelike transmission given that phylogenies privilege history over geography. They are certainly not well-suited for a balanced comparative investigation of the signatures of horizontal and vertical transmission. Our point is therefore consistent with Nelson-Sathi et al.’s (2010:2) recognition that “borrowing is a non-tree-like evolutionary event that cannot be reconstructed using phylogenetic trees.”

Let us step back a bit to justify this position. Essentially the situation with culture is rather like that with genes—each cultural trait of a society, like each gene in an organism, can have its own history (Boyd et al. 1997). Thus, just as gene trees are not always congruent with species trees (Maddison and Maddison 1997), so cultural trait trees are not always congruent with population history (see Pocklington 1996, reproduced in Borgerhoff Mulder et al. 2006). The challenge now is to continue developing a diversity of methods, in the tradition of “the broad church . . . of evolutionary anthropology” described by Shennan (2009:1). One promising route forward might lie in exploring the potential for phylogenetic methods to map the incongruent trees of traits with diverse histories, as proposed by Gray et al. (2008) and demonstrated by Tehrani et al. (2010), taking the lead from biological studies in which the history of one group of entities is determined in part by the history of others (as in host parasite coevolution or genes that jump from one species to another; Page and Charleston 1998). Another route lies in taking a less phylogenetically based approach and exploring the signal of language and geography using a more balanced method, as we demonstrate here. Which approach is better depends, as always, on the specifics of the question and the particular data available. In the case of the WNAI, given unresolved questions about the deeper linguistic connections among North American languages (Mithun 1999), we believe that non-phylogenetic methods are more appropriate and tractable (see Sellen and Hruschka 2004; Borgerhoff Mulder et al. 2006) and therefore favor the use of neighbor graphs in this study.

Our method compares more closely with non-phylogenetic methods reviewed in the introduction. For example Dow’s (2007) “network autocorrelation effects” model is a Gaussian analogue of the autologistic model we use here. In Dow’s model, as in the autologistic model, a society’s trait depends on the traits of its spatial and linguistic neighbors; Dow extends the model by introducing ecological “neighbors,” along with regressor variables. Exact estimation for Dow’s model is challenging, as it is for the autologistic model. Dow (2007) uses instrumental variables as proxies for the autocorrelation effects, obtaining model estimates by a two-stage least-squares procedure. Similarly Holden and Mace (1999) use logistic regression to examine simultaneously the effects of geographic and linguistic distance on sexual dimorphism.

It is clear that geographical distance between two extant populations is an imperfect measure of the potential for horizontal transmission in the past, partly because groups may have moved, partly because of topographical barriers and conduits, such as mountain ranges and river courses. The coarseness of this measure may be somewhat compensated for by our use of a 175 km cutoff. Similarly the use of a relatively shallow language classification may underestimate phylogenetic signal, but (as noted above) this is the more conservative approach when deeper language histories are contested. Lastly, treating both linguistic similarity and spatial proximity as binary traits may underestimate signal, although this lack of precision is likely to affect estimates of both transmission routes similarly.
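A binary spatial neighbor graph of the kind discussed here can be built by connecting every pair of societies whose separation falls within the cutoff. The sketch below uses great-circle (haversine) distance between latitude and longitude coordinates; whether the authors measured distance this way is not stated in this section, so that choice, the Earth-radius constant, and the coordinates are all assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def spatial_neighbor_graph(coords, cutoff_km=175.0):
    """Connect two societies when their distance is at most cutoff_km."""
    names = sorted(coords)
    graph = {name: set() for name in names}
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            if haversine_km(*coords[a], *coords[b]) <= cutoff_km:
                graph[a].add(b)
                graph[b].add(a)
    return graph

# Hypothetical coordinates (latitude, longitude); not taken from the WNAI data.
coords = {"s1": (40.0, -124.0), "s2": (41.0, -124.0), "s3": (45.0, -110.0)}
graph = spatial_neighbor_graph(coords)
```

A straight-line cutoff ignores the mountain ranges and river courses mentioned above; a refinement could replace the distance function with a travel-cost measure while leaving the graph construction unchanged.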

Our broad methodological aim—to compare the strengths of historical and spatial signals within a single modeling framework—is shared by Freckleton and Jetz (2009), though they use data of a different form than ours and consequently develop an approach that differs in the details. Specifically, Freckleton and Jetz (2009) are concerned with continuously varying traits of three mammal orders for which there are relatively well-resolved phylogenies. These elements combine in a natural way with point-referenced geographical distances, leading to models focused on the structure of variance/covariance matrices for multivariate Gaussian data. Our discrete and relatively coarse data led us to models of another form, requiring statistical and computational methods different from those of Freckleton and Jetz (2009).

Pattern and Process in Cultural Macroevolution

As with all comparative studies based on contemporaneous data, our results describe patterns in the data but shed only indirect light on the processes of trait transmission that operated in the past (Eerkens et al. 2006; Mesoudi and O’Brien 2009); indeed, almost all studies going back to Guglielmino et al. (1995) caution against over-interpreting pattern as process. A trait that appears to be strongly vertically transmitted should not necessarily be assumed to reflect retention of an ancestral trait; it could result from environmental adaptation if linguistically related populations disperse into similar habitats (i.e., habitat selection). Likewise spatial associations not only may result from borrowing (i.e., horizontal transmission) but could also reflect the shared history of populations living in close proximity or convergent evolution to a similar environment (e.g., Sellen and Hruschka’s 2004 analysis of marriage and resource-defense in the WNAI).

This is a general problem. Biologists are well aware of the difficulties in determining whether history or environment is the better predictor of cross-species variation in trait values (e.g., Freckleton and Jetz 2009), and of the fact that phylogenetic niche conservatism confounds the interpretation of phylogenetic signal and correlated evolution (Buckley et al. 2010; Freckleton and Jetz 2009). Similarly cultural evolutionists recognize that the ambiguous areas depicted in NeighborNet graphs (boxed areas showing the influence of multiple populations) could reflect convergent evolution rather than horizontal transmission (Gray et al. 2008).

Accordingly the significance of horizontal transmission that we infer from the strong performance of model 4 could easily reflect the settlement of daughter populations near their parents or convergent evolution to similar habitats. To some extent this interpretational problem is averted because in our sample linguistic and spatial neighbors are not coterminous: WNAI linguistic neighbors (by our definition) have only a 0.29 probability of being spatial neighbors, and spatial neighbors (by our definition) only a 0.32 probability of being linguistic neighbors. More generally, even if daughter populations do settle at random with respect to their parents and model 3 consequently detects only horizontal transmission, the pattern we have observed could reflect many variants of horizontal transmission: some societies might inherit their traits vertically and others acquire theirs horizontally, some traits might be inherited vertically and others horizontally, or all traits might be influenced by a more blended system of both vertical and horizontal forces; the trait-specific analyses of Jordan and collaborators suggest all such dynamics are plausible.
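The overlap between the two neighbor graphs can be summarized as two conditional proportions: the share of linguistic-neighbor pairs that are also spatial neighbors, and the reverse. The sketch below computes both from adjacency maps. Treating the reported probabilities as proportions of undirected edges is an assumption about how they were calculated, and the toy graphs are hypothetical.

```python
def edge_set(graph):
    """Undirected edges of an adjacency map, each as a frozenset pair."""
    return {frozenset((a, b)) for a, nbrs in graph.items() for b in nbrs}

def neighbor_overlap(spatial, linguistic):
    """Return (P(spatial | linguistic), P(linguistic | spatial)) over edges."""
    s_edges, l_edges = edge_set(spatial), edge_set(linguistic)
    if not s_edges or not l_edges:
        raise ValueError("both graphs need at least one edge")
    shared = len(s_edges & l_edges)
    return shared / len(l_edges), shared / len(s_edges)

# Hypothetical graphs: three spatial edges, two linguistic edges, one shared.
spatial = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}, "e": set()}
linguistic = {"a": {"b"}, "b": {"a"}, "c": {"e"}, "e": {"c"}, "d": set()}

p_spatial_given_ling, p_ling_given_spatial = neighbor_overlap(spatial, linguistic)
```

Low values of both proportions indicate that the two graphs carry largely distinct information, which is what allows the model to separate their signals.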

Given the familiar caveats we have reviewed above, we advocate for far more empirical work aimed at detecting spatial and linguistic patterning. Data from such efforts must be evaluated on a level playing field in order to compare alternative models fairly. Our paper provides such a principled method to infer effects of spatial proximity and shared ancestry. Using this approach we find evidence of both spatial and linguistic signals in trait variation across Western North American Indian societies.


For an annotated copy of our code, contact the authors.



We thank Eric Alden Smith, Bret Beheim, Robert Boyd, Victor Golla, Shelly Lundberg, Barney Luttbeg, Marc Mangel, Richard McElreath, Charlie Nunn, Peter Richerson, Bruce Winterhalder, and other members of the HBE lab group at UC Davis for their encouragement, comments and discussion. John Gillespie introduced us to the enumeration method used in the exact estimation study. The research was supported by a grant from the National Science Foundation (SBE Cultural Anthropology Program 0546119).

Supplementary material

12110_2012_9142_MOESM1_ESM.docx (468 kb)

Copyright information

© Springer Science + Business Media, LLC 2012

Authors and Affiliations

  • Mary C. Towner (1)
  • Mark N. Grote (2)
  • Jay Venti (2)
  • Monique Borgerhoff Mulder (2)
  1. Department of Zoology, Oklahoma State University, Stillwater, USA
  2. Department of Anthropology, University of California, Davis, USA
