A weighted difference barrier method in landscape genetics
Abstract
Identifying barriers to species and characterizing their effects on spatial distribution provides essential information for research in landscape genetics. We propose a weighted difference barrier (WDB) method as an alternative to maximum difference barriers (MDB), and as a step toward integrating more spatial modeling and methods into the problem-solving process. Overall, WDB provides quick and straightforward improvements to the drawbacks of MDB. WDB integrates more sample location relationships into the barrier construction and reveals potential barriers that would otherwise go undetected. WDB incorporates both within-group and between-group genetic information, and delineates the barriers as a more complex pattern.
Keywords
Barrier, Weighted Voronoi, Genetic distance, Spatial analysis, Landscape genetics
JEL Classification
C61 C65
1 Introduction
Understanding of geospatial patterns of genetic variation advances the knowledge of population genetics in addition to statistical and mathematical modeling (Epperson 2003). Landscape genetics is an effective approach for examining the influence landscape and environmental features have on population genetic structures. Although landscape genetics has deep roots in landscape ecology, population genetics, biogeography and phylogeography, it has only recently emerged as a field due to the increasing application of microsatellites (short, repetitive segments of DNA; in contrast to micro satellite, which is a type of mini satellite in remote sensing technology). Two fundamental aspects in landscape genetics are the detection of genetic discontinuities (barriers) and the correlation and explanation of the discontinuities with landscape features. Using genetic data collected by microsatellite markers, GIS and statistical methods have been effective barrier detectors (Guillot et al. 2005; Manel et al. 2003; Manni et al. 2004; Osakabe et al. 2005; Radke 1998). Barrier detection methods include isolated distance barriers (IDB), maximum difference barriers (MDB) and statistical methods.
Detecting barriers or establishing bounded point sets is a critical step in decomposing observations or data points into meaningful objects to assist spatial characterization and pattern recognition. With barriers identified and mapped, patterns of densities, distances, directions and shape can be classified, assisting in hypotheses generation, testing and eventually the explanation of form.
Genetic distance is an important measure for many indices calculated in landscape genetics, and it serves as an important baseline in barrier detection. The relationship between genetic distance and geographic space helps answer questions about gene flow, population structure, and the forces that shape species distribution. Geographic distribution of species is mainly determined by historical accidents. Barriers between species can be varied geographic features, and they change over time (Slatkin 1987). In some instances they are the result of species invasion, while at other times barrier patterns may simply map out as the result of species succession. In population genetic models such as island models, genetic distance is regarded as a measure of how populations are organized spatially, so geographic distance and genetic distance are expected to approximate each other when calculated across a simple landscape (Bowser 1996; Slatkin 1993; Weir 1990).
Genetic distance is also the key to link geospatial approaches to landscape genetics research. There are many methodological discussions of genetic distance, diversity and differentiation (Hamrick and Godt 1990; Hedrick 2005; Weir and Cockerham 1984; Culley et al. 2002; Nei 1973). Monmonier’s 1973 algorithm of MDB has been widely adopted in landscape genetics to detect boundaries (Manni et al. 2002, 2004). The maximum difference is calculated based on genetic distance. MDB connects sample locations with TIN (or Delaunay triangulation), and assigns genetic distances as values of the edges of TIN. MDB initiates the barriers from the largest genetic distance. Barriers computed from MDB always bisect TIN edges and align with boundaries of ordinary planar Voronoi diagrams (the mathematical dual of the TIN).
MDB has been applied to combine geographic and genetic information to identify genetic zones of plant species such as Manchurian ash across north-east China (Hu et al. 2008); of animal species such as land snail in the Western Mediterranean (Guiller et al. 2006), and common vole in northeast Poland (Ratkiewicz and Borkowska 2006); and of aquatic ecosystems such as yellow perch in Québec, Canada (Leclerc et al. 2008), scallops in the USA and Canada, and wild sea beet along the European Atlantic coast (Fievet et al. 2007). MDB is also utilized in human biology, an example exploring surnames in Spain (Boattini et al. 2007).
Successful adaptations of MDB have been used to geographically demonstrate genetic structure in combination with statistical methods such as spatial analysis of molecular variance (Santos et al. 2008; Guiller et al. 2006). In addition, since geographic distance and barriers are crucial considerations in many genetic barrier analyses, GIS, spatial algorithms and models are sought after as the demand for more effective integration of geographic and genetic information increases (Michels et al. 2001).
Although MDB’s principle follows Tobler’s first law of geography (Tobler 1970) which is an appropriate first order ingredient for constructing indices in landscape genetics, we argue genetic barrier delineation should also consider attributive weight (e.g., measures of genetic attributes). We present a weighted difference barrier (WDB) method to improve the identification of discontinuities in landscape genetics.
2 Weighted difference barrier (WDB) method
Barrier delineation of discrete point data is a problem of spatial tessellations. Among the existing barrier detection methods, especially the MDB, there are some limitations that need attention. For example, although MDB is better at finding predefined genetic barriers, it could also lead to division of populations not differentiated genetically (Dupanloup et al. 2002). In addition, MDB only includes TIN-neighbors, two points connected by a TIN edge, in genetic distance consideration. If two sampling locations are not TIN-neighbors, although there might be a barrier between them, the MDB method cannot detect it. Furthermore, the bisection between a pair of sampled locations overlooks the genetic differences between the two samples. For instance, no matter how different those two samples are, the barrier between them is defined as the bisector, which cannot be reasoned genetically or geographically. We propose a weighted difference barrier (WDB) method to mitigate these limitations.
The MDB method uses the ordinary Voronoi to delineate the barriers between sample locations. Our WDB method incorporates a weighted Voronoi to generate the barriers using genetic characteristics, such as gene diversity, as the weight. The weight assignment scheme is based on research results where there is a positive correlation between gene diversity and the size of patch area (Banks et al. 2005; Osakabe et al. 2005), and that gene diversity has insignificant relationships with fragmentation (Banks et al. 2005). At the species level, although the total species diversity is not significantly correlated with any variables of landscape patterns, large forest reserves tend to have relatively infrequent species. Therefore, large patches of natural forests are regarded as one of the important factors in preserving infrequent species (Fukamachi et al. 1996).
Genetic distance considers between-group variation, while gene diversity considers within-group variation. Since MDB is restricted to distance only, it is likely to overlook within-group variation. In contrast, our WDB incorporates both between-group and within-group variation. A weighted Voronoi diagram overcomes the major shortcomings of the ordinary Voronoi diagram, and takes both location and attribute information into consideration while generating the final spatial tessellation of a point set. Although detailed definitions of weighted Voronoi diagrams exist in the computational geometry literature (Aurenhammer and Edelsbrunner 1984; Okabe et al. 2000), we present one here.
Let S be a finite set of points in the Euclidean plane, and let p and q denote two points in S. Let the weights of the two points be w(p) and w(q). Let x be any point in the plane. The Euclidean distance between x and p is d_{e}(x, p), and the weighted distance between x and p is d_{mw}(x, p). Let region(p) denote the dominant region of point p, that is, p’s influence region in S. The following can be defined.
The multiplicatively weighted Voronoi diagram (MW-Voronoi): d_{mw}(x, p) = d_{e}(x, p)/w(p), and region(p) = {x|d_{mw}(x, p) ≤ d_{mw}(x, q), q in S}.
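The dominant-region test can be illustrated with a minimal Python sketch. It assumes the multiplicatively weighted distance d_{mw}(x, p) = d_{e}(x, p)/w(p), the form implied by the compound definition in the notes; the function names, coordinates and weights are hypothetical and chosen only for illustration.

```python
import math

def mw_distance(x, p, weight):
    """Multiplicatively weighted distance: Euclidean distance divided by w(p)."""
    return math.dist(x, p) / weight

def dominant_point(x, points, weights):
    """Index of the point whose dominant region contains x (smallest weighted distance)."""
    return min(range(len(points)),
               key=lambda i: mw_distance(x, points[i], weights[i]))

# Hypothetical sites and weights (for example, gene diversity values).
points = [(0.0, 0.0), (10.0, 0.0)]
weights = [0.6, 0.3]
x = (6.0, 0.0)  # geometrically closer to the second site

print(dominant_point(x, points, [1.0, 1.0]))  # equal weights: ordinary Voronoi assignment
print(dominant_point(x, points, weights))     # the heavier first site now dominates x
```

With equal weights the test reduces to the ordinary Voronoi assignment, which is why the location x changes owner only once the weights differ.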
Although a weighted Voronoi diagram can be defined in various ways, such as the additively weighted Voronoi,^{1} the compoundly weighted Voronoi,^{2} or the power diagram^{3} (Okabe et al. 2000), we employ the MW-Voronoi defined above to construct the WDB. The essential idea of these generalized Voronoi diagrams is to show how incorporating weight, dimension and other considerations changes the resulting spatial patterns, as well as the spatial and attributive relationships they express. The MW-Voronoi serves as a vehicle to demonstrate this idea of spatial change, and other weighted models are left to be explored in future research.
Table 1 Simulated allele frequency and gene diversity
Population | Allele1 | Allele2 | Allele3 | Gene diversity |
---|---|---|---|---|
1 | 0.686 | 0.055 | 0.259 | 0.460 |
2 | 0.346 | 0.023 | 0.631 | 0.482 |
3 | 0.538 | 0.131 | 0.331 | 0.584 |
4 | 0.023 | 0.689 | 0.288 | 0.442 |
5 | 0.166 | 0.361 | 0.473 | 0.619 |
6 | 0.556 | 0.078 | 0.366 | 0.551 |
7 | 0.320 | 0.366 | 0.314 | 0.665 |
8 | 0.850 | 0.104 | 0.046 | 0.264 |
9 | 0.870 | 0.070 | 0.061 | 0.235 |
10 | 0.176 | 0.672 | 0.152 | 0.494 |
11 | 0.726 | 0.186 | 0.087 | 0.430 |
12 | 0.354 | 0.494 | 0.152 | 0.607 |
13 | 0.157 | 0.238 | 0.605 | 0.553 |
14 | 0.651 | 0.013 | 0.336 | 0.463 |
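The gene diversity column can be checked with a short sketch. It assumes gene diversity is the expected heterozygosity H = 1 − Σ p_i², which the text does not state explicitly; for the rows checked here this assumption agrees with the tabulated values to within about 0.001, a difference consistent with rounding of the allele frequencies. The function name is hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity at one locus: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

# Allele frequencies and tabulated gene diversity for selected populations.
rows = {
    1: ((0.686, 0.055, 0.259), 0.460),
    4: ((0.023, 0.689, 0.288), 0.442),
    7: ((0.320, 0.366, 0.314), 0.665),
    8: ((0.850, 0.104, 0.046), 0.264),
    9: ((0.870, 0.070, 0.061), 0.235),
}

for pop, (freqs, tabulated) in rows.items():
    print(pop, round(gene_diversity(freqs), 3), tabulated)
```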
Table 2 Genetic distance of the simulated data
Population | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | | | | | | | | | | | | | | |
2 | 0.274 | | | | | | | | | | | | | |
3 | 0.026 | 0.154 | | | | | | | | | | | | |
4 | 1.452 | 0.962 | 0.889 | | | | | | | | | | | |
5 | 0.571 | 0.200 | 0.307 | 0.171 | | | | | | | | | | |
6 | 0.025 | 0.126 | 0.004 | 1.067 | 0.343 | | | | | | | | | |
7 | 0.282 | 0.272 | 0.142 | 0.211 | 0.068 | 0.188 | | | | | | | | |
8 | 0.049 | 0.640 | 0.130 | 1.813 | 0.972 | 0.145 | 0.426 | | | | | | | |
9 | 0.043 | 0.613 | 0.127 | 2.031 | 1.002 | 0.137 | 0.450 | 0.001 | | | | | | |
10 | 0.976 | 1.090 | 0.678 | 0.039 | 0.245 | 0.839 | 0.162 | 0.992 | 1.091 | | | | | |
11 | 0.044 | 0.558 | 0.092 | 1.196 | 0.710 | 0.117 | 0.287 | 0.010 | 0.016 | 0.701 | | | | |
12 | 0.399 | 0.675 | 0.281 | 0.176 | 0.225 | 0.367 | 0.059 | 0.404 | 0.445 | 0.065 | 0.267 | | | |
13 | 0.572 | 0.087 | 0.312 | 0.379 | 0.037 | 0.314 | 0.167 | 1.129 | 1.127 | 0.531 | 0.872 | 0.458 | | |
14 | 0.008 | 0.187 | 0.020 | 1.511 | 0.511 | 0.011 | 0.286 | 0.097 | 0.087 | 1.096 | 0.092 | 0.468 | 0.464 | |
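The text does not name the genetic distance measure used for the simulated data. As a hedged illustration, the sketch below assumes Nei's standard genetic distance, D = −ln(J_xy / sqrt(J_x J_y)), computed from the allele frequencies of the simulated populations. For the three pairs checked here, this assumption reproduces the tabulated distances to within rounding; it is not confirmed as the authors' computation. The function name is hypothetical.

```python
import math

def nei_standard_distance(x, y):
    """Nei's standard genetic distance between two allele-frequency vectors (one locus)."""
    jxy = sum(a * b for a, b in zip(x, y))
    jx = sum(a * a for a in x)
    jy = sum(b * b for b in y)
    return -math.log(jxy / math.sqrt(jx * jy))

# Allele frequencies of selected simulated populations.
freqs = {
    1: (0.686, 0.055, 0.259),
    2: (0.346, 0.023, 0.631),
    4: (0.023, 0.689, 0.288),
    8: (0.850, 0.104, 0.046),
    9: (0.870, 0.070, 0.061),
}

# Pairs with their tabulated distances, for comparison.
for (a, b), tabulated in {(1, 2): 0.274, (4, 9): 2.031, (8, 9): 0.001}.items():
    print(a, b, round(nei_standard_distance(freqs[a], freqs[b]), 3), tabulated)
```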
- (i) Construct weighted Voronoi polygons. For our sample data, we used MW-Voronoi construction based on pair-wise relationships of Apollonius circles (Aurenhammer and Edelsbrunner 1984; Mu 2004; Okabe et al. 2000). Since each boundary, a line or arc segment, represents a separation between two points, a one-to-many relationship between a genetic distance (of two sampling populations) and the weighted Voronoi polygon boundaries can be built. The “many” part in the one-to-many relationship is due to a geometric property of the weighted Voronoi polygons: the boundary between two points may consist of multiple, disconnected parts (Fig. 1).
- (ii) Calculate genetic distance for all pairs of points that share a weighted Voronoi boundary. Table 2 shows that 34 of the 91 pairs of genetic distances from the sample data satisfy this criterion; they are highlighted in bold. Join the genetic distance values to the corresponding weighted Voronoi boundaries.
- (iii)
- (iv) The weighted barrier is then extended in both directions following the weighted Voronoi boundaries associated with the highest distance. In Fig. 2, for example, the barrier follows the boundary with distance 0.972 rather than 0.171, and then the boundary with distance 0.404 rather than 0.225. The process continues until the barrier has either formed a closed region around a population, e.g., point 8, or reached the outer limit of the study area. The result is the first level barrier, which divides the whole space into two regions: the enclosed region surrounding point 8, and the rest of the space.
- (v) Depending on the data set, the weighted barrier may have multiple levels. Each upper level barrier bisects the region it belongs to. Within each of the bisected regions, the next lower level of weighted barriers is formed following the same criteria as outlined above. Figure 3a shows three levels of WDB barriers formed by the sample data set. For comparison purposes, Fig. 3b shows three levels of MDB barriers formed by the same data set.
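The extension rule in step (iv) can be sketched as a greedy walk over boundary segments. This is a simplified illustration, not the authors' implementation. It assumes the barrier starts at the segment with the largest genetic distance (as MDB does), that each boundary segment is stored as a pair of end vertices plus the genetic distance joined to it, and that a hypothetical `OUTER` label marks the limit of the study area. The topology below is a toy example; its distance values only echo the choices quoted in step (iv).

```python
from collections import defaultdict

OUTER = "outer"  # hypothetical label for any vertex on the outer limit of the study area

def trace_barrier(segments):
    """Greedily trace one barrier through segments given as (u, v, genetic_distance)."""
    incident = defaultdict(list)
    for idx, (u, v, _) in enumerate(segments):
        incident[u].append(idx)
        incident[v].append(idx)

    # Assumed starting rule: begin at the segment with the largest genetic distance.
    seed = max(range(len(segments)), key=lambda i: segments[i][2])
    used = {seed}
    visited = set(segments[seed][:2])
    path = [seed]

    # Extend from both ends of the seed segment.
    for side, end in enumerate(segments[seed][:2]):
        while end != OUTER:
            options = [i for i in incident[end] if i not in used]
            if not options:
                break
            best = max(options, key=lambda i: segments[i][2])  # highest distance wins
            used.add(best)
            u, v, _ = segments[best]
            end = v if u == end else u
            if side == 0:
                path.insert(0, best)
            else:
                path.append(best)
            if end in visited:
                break  # the barrier has closed on itself or reached the limit again
            visited.add(end)
    return [segments[i] for i in path]

# Toy topology (hypothetical vertices v1..v5).
segments = [
    ("v1", "v2", 1.000),
    ("v2", "v3", 0.972),
    ("v2", "v4", 0.171),
    ("v3", OUTER, 0.404),
    ("v3", "v5", 0.225),
    ("v1", OUTER, 0.500),
]
barrier = trace_barrier(segments)
print([d for _, _, d in barrier])
```

Lower-level barriers, as in step (v), could be obtained by repeating the same walk inside each region produced by the previous level.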
3 Testing the WDB method with empirical data
Table 3 Population heterozygosity and pairwise genetic differentiation of sample data (compiled from Kenchington et al. 2006)
Population (heterozygosity) | 1 (0.739) | 2 (0.710) | 3 (0.722) | 4 (0.674) | 5 (0.701) | 6 (0.793) | 7 (0.796) | 8 (0.762) | 9 (0.727) | 10 (0.738) | 11 (0.724) | 12 (0.687) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1. Annapolis | | | | | | | | | | | | |
2. Digby | −0.001 | | | | | | | | | | | |
3. Lurcher | −0.003 | −0.001 | | | | | | | | | | |
4. Browns | −0.002 | −0.0001 | 0.002 | | | | | | | | | |
5. Georges (Can) | −0.003 | 0.002 | −0.0001 | 0.005 | | | | | | | | |
6. Georges (US) | 0.041 | 0.043 | 0.034 | 0.044 | 0.046 | | | | | | | |
7. NJ | 0.001 | 0.002 | 0.003 | 0.001 | 0.008 | 0.023 | | | | | | |
8. Western | 0.001 | 0.002 | 0.007 | 0.001 | 0.006 | 0.040 | 0.001 | | | | | |
9. PEI | 0.008 | 0.006 | 0.008 | 0.007 | 0.010 | 0.050 | 0.008 | 0.008 | | | | |
10. Gaspé | 0.023 | 0.025 | 0.022 | 0.024 | 0.027 | 0.006 | 0.009 | 0.020 | 0.028 | | | |
11. Nfld (SP) | 0.018 | 0.015 | 0.021 | 0.012 | 0.019 | 0.061 | 0.017 | 0.017 | 0.003 | 0.034 | | |
12. Nfld (TB) | 0.001 | 0.005 | 0.003 | −0.002 | 0.007 | 0.043 | 0.007 | 0.005 | −0.003 | 0.024 | 0.005 | |
First, there are more shape variations in the WDB boundaries than in the MDB boundaries. MDB boundaries are straight lines that always bisect two population locations, whereas WDB boundaries can be curved and can run between two locations at a position determined by genetic information such as heterozygosity. For example, the heterozygosity of Georges (Can), 0.701, is smaller than that of Georges (US), 0.793, so the WDB between them is concave toward the Canadian site, giving it a relatively smaller and enclosed region. Such a spatial pattern corresponds to the relationship that populations with larger gene diversity tend to have larger patch sizes (Banks et al. 2005; Osakabe et al. 2005).
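The curvature described above follows from the geometry of the multiplicatively weighted boundary: the set of locations x with d_{e}(x, p)/w(p) = d_{e}(x, q)/w(q) is an Apollonius circle whenever the weights differ. The sketch below computes that circle for two sites. The coordinates are hypothetical (the source gives no site coordinates), while the weights are the heterozygosities of Georges (Can) and Georges (US) from Table 3; the function name is an assumption.

```python
import math

def apollonius_circle(p, q, wp, wq):
    """Circle of points x with d(x, p)/wp == d(x, q)/wq; requires wp != wq."""
    k2 = (wp / wq) ** 2
    if math.isclose(k2, 1.0):
        raise ValueError("equal weights give the straight perpendicular bisector")
    cx = (p[0] - k2 * q[0]) / (1.0 - k2)
    cy = (p[1] - k2 * q[1]) / (1.0 - k2)
    radius = math.sqrt(k2) * math.dist(p, q) / abs(1.0 - k2)
    return (cx, cy), radius

# Hypothetical coordinates, one unit apart.
georges_can, georges_us = (0.0, 0.0), (1.0, 0.0)
center, radius = apollonius_circle(georges_can, georges_us, 0.701, 0.793)

# The circle encloses the site with the smaller weight.
encloses_can = math.dist(center, georges_can) < radius
encloses_us = math.dist(center, georges_us) < radius
print(center, radius, encloses_can, encloses_us)
```

Because the two weights are close, the circle is large relative to the distance between the sites, so the boundary is only gently curved near them; a stronger contrast in weights would give a tighter enclosure around the lower-weight site.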
Second, the first barrier of the MDB isolates the site of Georges (US) from the others, whereas the first barrier of the WDB isolates not only that site but also the Gaspé in the far north. This spatial formation arises because a weighted Voronoi polygon may consist of multiple, disconnected parts, as described earlier. The pairwise differentiation between the Gaspé and Georges (US) is 0.006, and the average pairwise differentiation between the Gaspé and all other sites except Georges (US) is 0.022 (Table 3). Therefore, the scallop population from the Gaspé is more similar to that from Georges (US) than to those from the other ten sites, and the WDB method captures this relationship. This WDB delineation matches one of the observations in the work of Kenchington et al. (2006) that “Georges (US) and the Gaspé are significantly differentiated from each other and all other Populations”.
Third, the hierarchy of barriers changes more rapidly in the WDB method. The east [Nfld (TB), Nfld (SP) and PEI] and west [Digby, Annapolis, Western, Lurcher, Browns, and Georges (Can)] clusters are formed at the second level of the WDB barrier, and Georges (Can) and Nfld (TB) are distinguished at the third level of the WDB. In contrast, in the MDB, the east and west clusters are not divided until the third level.
4 Discussion
The WDB method calculates genetic distance for weighted Voronoi neighbors, that is, two points separated by a weighted Voronoi polygon boundary, whereas MDB calculates genetic distance for ordinary Voronoi neighbors, that is, two points separated by an ordinary Voronoi polygon boundary. Because the weights are derived from gene diversity, WDB considers both within-group genetic information (gene diversity) and between-group genetic information (genetic distance). Usually, the number of weighted Voronoi neighbors is larger than the number of Voronoi neighbors, indicating that more relationships are being considered. In our sample data, 34 pairs of genetic distances are used for WDB and 31 for MDB.
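How weighting can add neighbor relationships is shown by the raster approximation sketched below. It labels each grid cell with its dominant point under the multiplicatively weighted distance and reports every pair of labels found in adjacent cells. This is only a brute-force illustration with hypothetical points, weights and grid size; it is not the Apollonius-circle construction used in the paper, and a coarse grid can miss very short boundaries.

```python
import math

def mw_neighbors(points, weights, size=10.0, cells=40):
    """Approximate weighted Voronoi neighbor pairs on a cells x cells raster."""
    step = size / cells

    def label(ix, iy):
        # Dominant point at the cell center: smallest distance divided by weight.
        x = ((ix + 0.5) * step, (iy + 0.5) * step)
        return min(range(len(points)),
                   key=lambda i: math.dist(x, points[i]) / weights[i])

    grid = [[label(ix, iy) for ix in range(cells)] for iy in range(cells)]
    pairs = set()
    for iy in range(cells):
        for ix in range(cells):
            a = grid[iy][ix]
            # Compare with the right-hand and upper neighbor cells.
            for jx, jy in ((ix + 1, iy), (ix, iy + 1)):
                if jx < cells and jy < cells and grid[jy][jx] != a:
                    pairs.add(tuple(sorted((a, grid[jy][jx]))))
    return pairs

# Three hypothetical collinear sites; the middle one gets a small weight.
points = [(2.0, 5.0), (5.0, 5.0), (8.0, 5.0)]
print(sorted(mw_neighbors(points, [1.0, 1.0, 1.0])))  # ordinary Voronoi neighbors
print(sorted(mw_neighbors(points, [1.0, 0.2, 1.0])))  # weighted Voronoi neighbors
```

In this toy case the low-weight middle site shrinks to a small enclosed region, so the two outer sites become neighbors of each other as well, which is the kind of additional relationship the weighted tessellation can expose.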
Our WDB method tessellates sample points with a weighted Voronoi diagram. The genetic attribute values of each sample point determine its weight, so the genetic discontinuity between two points will not always be the bisector between them. The WDB boundaries are constructed from the weighted Voronoi diagram and can generate hierarchical levels. Segments of WDB boundaries vary more than those of MDB: they can be multi-part and disconnected, and rather than being limited to the straight lines of the MDB they often trace circular arcs.
5 Summary and conclusion
Identifying barriers to species and characterizing their effects on spatial distribution provides essential background information for research in landscape ecology, population genetics, biogeography, historical biogeography, and phylogeography. Overall, WDB provides quick and straightforward improvements to the drawbacks of MDB. WDB integrates more sample location relationships into the barrier construction and reveals potential barriers that would otherwise go undetected. WDB incorporates both within-group and between-group genetic information, and delineates the barriers as a more complex pattern.
Besides the WDB, other techniques are being explored for boundary delineation that make use of simulated annealing algorithms (Dupanloup et al. 2002), Bayesian criteria or specific distance-decay behaviors (Guillot et al. 2005; Santos et al. 2008; Culley et al. 2002; Guiller et al. 2006; Hull et al. 2008; Manel et al. 2003; Sambridge 1998). We argue that the method introduced here is an alternative approach, and a first step toward integrating more spatial modeling and methods into the problem-solving process. It also raises the question of whether gene diversity alone should be used to assign weights to each population site. Further research will explore other weighted attributes and test the method on data with genetic distances collected from microsatellite markers. Furthermore, embedded within a GIS environment, we will explore the correlation between landscape features and the genetic discontinuities detected by our weighted method.
New spatial algorithms that decompose observations or data points into meaningful objects present us with a variety of ideas for delineating barriers. The WDB defines a more appropriate model and logically should map more realistic barriers. These new benchmarks can prove quite useful in characterizing spatial patterns and can lead to more enlightened hypotheses or, at the very least, help us ask more intelligent questions. Building on this additional understanding of geospatial genetic variation, future research should be extended beyond static forms to the dynamic processes of landscape genetics.
^{1} The additively weighted Voronoi diagram (AW-Voronoi): d_{aw}(x, p) = d_{e}(x, p) − w(p), and region(p) = {x|d_{aw}(x, p) ≤ d_{aw}(x, q), q in S}.
^{2} The compoundly weighted Voronoi diagram (CW-Voronoi): Let w1(p) be the multiplicative weight of point p, and let w2(p) be the additive weight of point p. The compoundly weighted Voronoi diagram is actually a combination of MW-Voronoi and AW-Voronoi, where d_{cw}(x, p) = d_{e}(x, p)/w1(p) − w2(p), and region(p) = {x|d_{cw}(x, p) ≤ d_{cw}(x, q), q in S}.
^{3} The power diagram (PW-Voronoi): d_{pw}(x, p) = d_{e}^{2}(x, p) − w(p), and region(p) = {x|d_{pw}(x, p) ≤ d_{pw}(x, q), q in S} (Okabe et al. 2000).
F_{ST} is the reduction in diversity (heterozygosity) expected with random mating at one level of population hierarchy relative to another more inclusive level. F_{ST} = (H_{T} − H_{S})/H_{T}, where H_{T} is the genetic diversity within the total population, and H_{S} is the mean of all subpopulation diversity (Wright 1951).
G_{ST} is defined as the proportion of genetic diversity that resides among populations. It is equivalent to Wright’s (1951) F_{ST} when there are only two alleles at a locus, and, in the case of multiple alleles, G_{ST} is equivalent to the weighted average of F_{ST} for all alleles (Nei 1973).
Weir and Cockerham’s θ is the unbiased estimator of F_{ST} that corrects for error associated with incomplete sampling of a population. \( \hat{\theta} = a/(a + b + c), \) where a is the variance between populations, b is the variance between individuals within populations, and c is the variance between gametes within individuals (Weir and Cockerham 1984).
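The F_{ST} definition given in these notes, F_{ST} = (H_{T} − H_{S})/H_{T}, can be worked through with a small sketch. It assumes a single locus, equally sized subpopulations, gene diversity computed as 1 − Σ p_i², and H_{T} computed from the mean allele frequencies across subpopulations; these simplifications and the function names are assumptions made for illustration only.

```python
def gene_diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def f_st(subpop_freqs):
    """F_ST = (H_T - H_S) / H_T for equally weighted subpopulations at one locus."""
    n = len(subpop_freqs)
    h_s = sum(gene_diversity(f) for f in subpop_freqs) / n       # mean within-subpopulation diversity
    mean_freqs = [sum(col) / n for col in zip(*subpop_freqs)]    # pooled allele frequencies
    h_t = gene_diversity(mean_freqs)                             # total diversity
    return (h_t - h_s) / h_t

print(f_st([(1.0, 0.0), (0.0, 1.0)]))  # fixed for different alleles: complete differentiation
print(f_st([(0.3, 0.7), (0.3, 0.7)]))  # identical frequencies: no differentiation
```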
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.