Identifying future research needs in landscape genetics: where to from here?
- Cite this article as:
- Balkenhol, N., Gugerli, F., Cushman, S.A. et al. Landscape Ecol (2009) 24: 455. doi:10.1007/s10980-009-9334-z
Landscape genetics is an emerging interdisciplinary field that combines methods and concepts from population genetics, landscape ecology, and spatial statistics. Interest in landscape genetics is steadily increasing, and the field is evolving rapidly. Here we outline four major challenges for future landscape genetic research that were identified during an international landscape genetics workshop. These challenges include (1) the identification of appropriate spatial and temporal scales; (2) current analytical limitations; (3) the expansion of the current focus in landscape genetics; and (4) interdisciplinary communication and education. Addressing these research challenges will greatly improve landscape genetic applications and contribute to the future growth of this promising field.
Keywords: Landscape resistance; Adaptive genetic variation; Gene flow; Single-nucleotide polymorphisms; Spatial heterogeneity; Spatio-temporal scale
Landscape genetics is a new and rapidly evolving interdisciplinary field that combines concepts and methods from population genetics, landscape ecology and spatial statistics. Landscape genetics explicitly quantifies the effects of landscape composition, configuration and matrix quality on spatial patterns in neutral and adaptive genetic variation and underlying microevolutionary processes. Landscape genetic concepts and approaches have been reviewed in several recent papers (Manel et al. 2003; Holderegger and Wagner 2006; Storfer et al. 2007; Holderegger and Wagner 2008). Interest in landscape genetics is steadily increasing, as shown by the growing number of papers published and symposia held on the topic. As a follow-up to a special issue on landscape genetics in this journal (Holderegger and Wagner 2006), a landscape genetics symposium and workshop were held at the 2007 IALE World Congress in Wageningen, The Netherlands, with the goal of developing a research agenda for landscape genetics. Here we report on four key challenges that landscape genetics currently faces.
Among these challenges are the identification of appropriate spatial and temporal scales (challenge 1) and current analytical limitations when testing for landscape-genetic relationships (challenge 2). Another challenge is related to expanding the focus of landscape genetics from the assessment of gene flow to analyses of the distribution and spread of adaptive genetic variation (challenge 3). Finally, interdisciplinary communication and education are major tasks for the progress of landscape genetics as they will be mandatory for the field’s future development (challenge 4).
Challenge 1: Spatial and temporal scale issues
Organisms exploit their environments over spatial scales and time periods that are species-specific and may further depend on life-stage, sex, or season. Each landscape genetic study thus needs to address the appropriate spatial and temporal scale, further considering whether the data provide the spatio-temporal resolution needed to answer particular questions. A major challenge for landscape genetic studies is the potential mismatch of the temporal and spatial scales of landscape and genetic data. While the genetic data are an amalgamation of historical and current processes, the landscape data available usually reflect contemporary configurations. Thus, current landscape characteristics are used to explain genetic patterns that have potentially evolved over many generations, often reflecting several decades, centuries, or even millennia (Orsini et al. 2008; Scandura et al. 2008). To address these concerns, researchers need to choose appropriate genetic markers, find meaningful combinations of landscape and genetic data, and account for confounding historic influences.
Molecular markers can be carefully selected to match the research questions and spatio-temporal scales by considering not only mutation rate and variability, but also genome representativeness, inheritance and ploidy level (Brumfield et al. 2003). For a large-scale study on ancient processes (e.g., ice-age related migration), uniparentally inherited, haploid markers of the slowly evolving organellar DNA (i.e., mitochondria or chloroplasts) are well suited, since they have lower mutation rates and smaller effective population sizes than nuclear loci at neutral sites (Petit et al. 2005; Petit and Vendramin 2007). For example, the entire range of the endemic Tehuantepec jackrabbit (Lepus flavigularis) spans ca. 50 km around the southern part of the Isthmus of Tehuantepec, Oaxaca, Mexico. Using mitochondrial markers, Rico et al. (2008) found two highly divergent clades, corresponding to distinct evolutionary lineages, physically separated by the isthmus. The slowly evolving, maternally inherited mitochondrial markers, however, cannot be used to infer the role of gene flow by male dispersal or how such gene flow may have been altered by recent habitat fragmentation. Nuclear markers are particularly useful for studies at smaller spatial scales (i.e., local or regional), and generally correspond to finer temporal scales. Co-dominant nuclear microsatellites (= simple sequence repeats, SSRs; Selkoe and Toonen 2006) or dominant amplified fragment length polymorphisms (AFLPs; Meudt and Clarke 2007) are currently the most widely used nuclear markers, but technical development may soon replace them with single-nucleotide polymorphisms (SNPs; Morin et al. 2004). Obtaining and analyzing SNP data has great potential for molecular research, because SNPs are distributed across the entire genome, established technologies facilitate large-scale data collection, and mutational models for SNPs are relatively well known. Thus, a genome-wide scan of hundreds or thousands of SNPs allows researchers to account for the large among-locus variation in population genetic parameters (Morin et al. 2004).
In addition to adequate marker choice, landscape and genetic data should refer to similar temporal scales. Thus, F_ST-based estimates of genetic differentiation (or their inter-individual equivalents like a_r; Rousset 2000) should be matched with historical landscape data (e.g., derived from aerial photographs or historical maps), while estimates of contemporary gene flow (e.g., from assignment tests or parentage analysis) should be related to present-day landscape configurations (e.g., Kamm 2008; Vandergast et al. 2007). Oftentimes, it will not be possible to associate a distinct temporal scale to population genetic parameters, or determine an exact time scale on which landscape changes affect genetic measures. Landscape data from multiple time periods can be used in such cases to determine which temporal scale best corresponds to observed genetic patterns. For instance, Holzhauer et al. (2006) found that genetic structure in a bush cricket was best explained by landscape pattern 50 years in the past, and Keyghobadi et al. (2005) showed that genetic differentiation among subpopulations of an alpine butterfly corresponded to contemporary forest cover, while genetic diversity within subpopulations was related to the spatial pattern of forest cover 40 years in the past. Finally, we encourage researchers to attempt to account for past landscape patterns, population fluctuations, and other historical influences when interpreting contemporary genetic structures (e.g., Landergott et al. 2001; Vandergast et al. 2007).
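Comparing landscape data from several time periods against a single genetic data set can be sketched as follows. This is a minimal illustration, not the procedure used in the studies cited above: it assumes a simple Mantel-type permutation test on pairwise distance matrices, and the function names, the labels "past" and "present", and all matrix values are hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, landscape, permutations=999, seed=1):
    """Mantel correlation with a one-sided permutation p-value.

    Rows and columns of the genetic matrix are permuted together, so the
    pairwise structure of the distances is preserved in every permutation.
    """
    rng = random.Random(seed)
    n = len(genetic)
    land_vec = upper_triangle(landscape)
    observed = pearson(upper_triangle(genetic), land_vec)
    at_least_as_large = 1  # the observed value counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), land_vec) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)


# Hypothetical pairwise genetic distances among five sampling sites.
genetic = [
    [0.00, 0.02, 0.08, 0.09, 0.10],
    [0.02, 0.00, 0.07, 0.08, 0.09],
    [0.08, 0.07, 0.00, 0.03, 0.04],
    [0.09, 0.08, 0.03, 0.00, 0.02],
    [0.10, 0.09, 0.04, 0.02, 0.00],
]

# Hypothetical effective distances derived from maps of two time periods.
landscape_by_period = {
    "past": [
        [0, 1, 6, 7, 8],
        [1, 0, 5, 6, 7],
        [6, 5, 0, 1, 2],
        [7, 6, 1, 0, 1],
        [8, 7, 2, 1, 0],
    ],
    "present": [
        [0, 1, 2, 3, 4],
        [1, 0, 1, 2, 3],
        [2, 1, 0, 1, 2],
        [3, 2, 1, 0, 1],
        [4, 3, 2, 1, 0],
    ],
}

results = {period: mantel_test(genetic, matrix)
           for period, matrix in landscape_by_period.items()}
best_period = max(results, key=lambda period: results[period][0])
```

In this toy example the genetic distances track the "past" landscape more closely than the "present" one. With real data, the choice of distance measure and test would need the kind of scrutiny discussed under Challenge 2.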
Ideally, sampling schemes applied in landscape genetics should reflect a priori hypotheses, and effectively capture landscape heterogeneity with respect to the study goals. Since mismatch of spatial data scales can result in serious errors (Pascual-Hortal and Saura 2007; Wu 2007), landscape genetic studies should evaluate how the spatial configuration of the samples and the spatial resolution of the landscape data influence statistical results and ecological conclusions (Schwartz and McKelvey 2009). Additional research is needed to understand the exact implications of varying scales and sampling configurations for landscape genetic inference and to develop analytical approaches that effectively account for the spatio-temporal dynamics associated with both landscape and genetic data. Future research also needs to evaluate exactly how quickly estimates of genetic variation derived using different genetic markers respond to landscape changes, and identify the factors that influence possible time lags (e.g., population size, genetic diversity, mating patterns, etc.). For now, carefully evaluating properties and assumptions of different data and methods is a first step to ensure accurate and reliable conclusions.
Challenge 2: Analytical limitations
A large variety of statistical methods have been proposed for analyzing the spatial distribution of genetic variation, and for linking observed genetic patterns to landscape characteristics. However, most of these methods were originally developed for other disciplines, and for different kinds of data (e.g., Geffen et al. 2004; Spear et al. 2005; Holzhauer et al. 2006), and may therefore be inappropriate for landscape genetics. For instance, population genetic models of isolation-by-distance as well as spatial statistics such as Moran’s I assume that the environment is homogeneous or at least that gene flow between populations does not depend on the quality of the intervening matrix. At the same time, very few published studies compare the various statistical approaches, or test their reliability under realistic landscape genetic scenarios (Latch et al. 2006; Chen et al. 2007; Balkenhol et al. 2009). It is therefore unclear under which conditions the various methods produce accurate, valid and repeatable results in a landscape genetics context. This is alarming from a scientific standpoint and warrants research that critically evaluates the strengths and weaknesses of current analytical approaches. Such evaluations of methods are required for developing optimal analysis strategies in landscape genetics, for comparisons and meta-analyses across studies, and for designing new and improved statistical techniques. Thus, understanding the advantages and limitations of current analytical approaches is vital for the future progression of landscape genetics.
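To make the homogeneity assumption concrete, the sketch below computes Moran's I with weights that depend only on the straight-line distance between sampling locations; nothing about the intervening matrix enters the statistic. The function name, the binary neighbour rule, and the toy allele frequencies are illustrative assumptions.

```python
import math


def morans_i(values, coords, max_distance):
    """Moran's I with binary weights.

    Two distinct sites are neighbours (weight 1) if their Euclidean
    distance is at most max_distance; all other weights are 0.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_sum = 0.0
    cross_product = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if math.dist(coords[i], coords[j]) <= max_distance:
                weight_sum += 1.0
                cross_product += deviations[i] * deviations[j]
    variance_term = sum(d * d for d in deviations)
    return (n / weight_sum) * (cross_product / variance_term)


# Six hypothetical sites along a transect, one distance unit apart.
coords = [(float(x), 0.0) for x in range(6)]
clinal_frequencies = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]       # smooth gradient
alternating_frequencies = [0.2, 0.8, 0.2, 0.8, 0.2, 0.8]  # checkerboard

i_clinal = morans_i(clinal_frequencies, coords, max_distance=1.0)
i_alternating = morans_i(alternating_frequencies, coords, max_distance=1.0)
```

The gradient yields positive autocorrelation and the alternating pattern negative autocorrelation. In both cases the weights would be identical whether the sites were separated by continuous habitat or by a barrier, which is the limitation noted above.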
To recognize and overcome current analytical limitations, several research topics need to be addressed. First, spatially-explicit, individual-based approaches for simulating gene flow in heterogeneous landscapes need to be developed (also see challenge 3). Such simulations will be essential for tracking the complexity of landscape-genetic processes and for creating data with known landscape-genetic relationships needed in method evaluations. Furthermore, simulations will be useful for developing more realistic and robust population genetic models (see challenge 3), as well as for the formulation of appropriate null models for landscape genetic analyses. Currently, most landscape genetic studies are limited to relatively simple null-hypothesis testing, such as testing for the presence of a barrier, rather than comparing the evidence for competing hypotheses involving more complex landscape effects. This may lead to important misinterpretations, as illustrated by Cushman et al. (2006) who found that although simple models of isolation-by-distance or a single barrier to gene flow were statistically significant, models involving land cover and elevation were much better at explaining the observed genetic structure in black bears. Also, most existing population genetic models assume that populations exist as discrete patches or follow simple isolation-by-distance patterns across uniform landscapes. Thus, current population genetic theory is generally not suited for predicting gene flow and genetic structures in heterogeneous environments (but see McRae 2006), especially when analyses involve individuals, rather than populations. Appropriate null hypotheses or meaningful quantitative predictions about landscape effects on genetic patterns are unfortunately rare among published landscape genetics studies.
Until better landscape genetic models are available, we recommend that researchers explicitly state what landscape-genetic mechanisms are assumed in their analyses, rather than testing for correlations and trying to explain significant findings a posteriori. Simple null-hypothesis testing is particularly problematic when measures of genetic differentiation are correlated against connectivity estimates obtained from least-cost or circuit-theoretic analyses. These analyses require the construction of friction grids that reflect the relative, hypothesized resistance of each landscape cell to gene flow. However, landscape resistance to gene flow is actually the result of multiple, interacting factors (e.g., obstacles to movement, local population densities, mortality rates, behavior), and statistical significance does not automatically indicate ecological relevance or causation. Thus, if friction grids are used to test ecological hypotheses about landscape connectivity, they have to be well justified (e.g., based on independent empirical data sources or expert knowledge), or should utilize multiple cost or resistance values (Coulon et al. 2004; Cushman et al. 2006; Epps et al. 2007). Landscape genetics can substantially contribute to studies of population fragmentation and connectivity by explicitly using resistance maps for delineating regional conservation corridors, and by quantifying the degree of expected connectivity between specific areas (Epps et al. 2007; Cushman et al. 2009; McRae et al. 2008).
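The sensitivity of connectivity estimates to the hypothesized resistance values can be illustrated with a small least-cost calculation. The sketch below is not taken from any of the cited studies: it assumes a 4-neighbour grid, a step cost equal to the mean friction of the two cells involved, and a hypothetical 3 x 3 landscape with one column of candidate barrier cells.

```python
import heapq


def least_cost_distance(friction, start, goal):
    """Cumulative cost of the cheapest 4-neighbour path between two cells.

    Moving between adjacent cells costs the mean of their friction values
    (an assumption of this sketch; other conventions exist).
    """
    rows, cols = len(friction), len(friction[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                step = (friction[r][c] + friction[nr][nc]) / 2.0
                new_cost = cost + step
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return float("inf")


def grid_with_barrier(barrier_value):
    """Hypothetical 3 x 3 friction grid whose middle column is a barrier."""
    return [[1.0, barrier_value, 1.0] for _ in range(3)]


# Evaluate several candidate resistance values instead of a single guess.
costs = {
    barrier_value: least_cost_distance(
        grid_with_barrier(barrier_value), (1, 0), (1, 2))
    for barrier_value in (1.0, 5.0, 50.0)
}
```

Here the effective distance between the same two cells grows from 2 to 6 to 51 cost units as the assumed barrier resistance increases, so any correlation with genetic differentiation depends directly on values that the analyst has to justify.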
We also encourage researchers to explore alternatives to simple null-hypothesis testing for landscape genetics (e.g., effect size statistics, Bayesian and information-theoretic approaches; Stephens et al. 2007). Such approaches could help to quantify the relative effects of various landscape parameters on genetic variation (Cushman et al. 2006; Foll and Gaggiotti 2006; Faubet and Gaggiotti 2008), and they could lead to predictive models of genetic population structure and gene flow under different climate and landscape change scenarios. However, measures of genetic differentiation and effective separation distances are mostly pair-wise, and often show spatial autocorrelation. Thus, landscape genetic studies usually rely on non-independent observations, and future research is needed to assess under what circumstances the alternative analytical approaches mentioned above can be used for this kind of data.
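As one example of an information-theoretic summary, Akaike weights express the relative support for each model in a candidate set. The sketch below is only an illustration: the model names and AIC values are invented, and, as noted above, it remains an open question when such measures are appropriate for non-independent pairwise data.

```python
import math


def akaike_weights(aic_by_model):
    """Convert AIC values into Akaike weights that sum to one."""
    best = min(aic_by_model.values())
    relative_likelihood = {
        name: math.exp(-(aic - best) / 2.0)
        for name, aic in aic_by_model.items()
    }
    total = sum(relative_likelihood.values())
    return {name: value / total
            for name, value in relative_likelihood.items()}


# Invented AIC values for three competing hypotheses (not real results).
candidate_aic = {
    "distance_only": 110.0,
    "barrier": 102.0,
    "land_cover": 100.0,
}
weights = akaike_weights(candidate_aic)
best_model = max(weights, key=weights.get)
```

Rather than declaring a single model significant, the weights rank all candidates and show how much support separates them.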
Distinguishing between true landscape effects and other, endogenous factors influencing genetic variation will be another major task for future research. Genetic variation can be spatially structured for many reasons (e.g., social interactions, sampling effects, historical influences, etc.), and not all of these factors are landscape-dependent. Thus, statistically separating different variables that influence genetic diversity and structure is crucial for correct ecological inferences (Wagner and Fortin 2005), and should also be considered when designing landscape genetic studies. For example, studies will have greater statistical power to distinguish isolation-by-distance from actual landscape effects when measures of landscape resistance deviate strongly from straight-line distances. Overall, a better integration of spatial statistical approaches with realistic, simulation- and theory-based expectations of landscape-genetic relationships is needed to advance landscape genetic analyses.
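One common way to separate isolation-by-distance from a landscape effect is a partial correlation between pairwise distances, as used in partial Mantel tests. The sketch below computes only the statistic, without a permutation test, and uses invented values: four hypothetical populations on a transect, with a barrier between the second and third that adds five cost units.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Pairwise values in the order (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
euclidean = [1.0, 2.0, 3.0, 1.0, 2.0, 1.0]
resistance = [1.0, 7.0, 8.0, 6.0, 7.0, 1.0]           # barrier adds 5 units
genetic = [0.012, 0.069, 0.081, 0.061, 0.070, 0.009]  # invented distances

landscape_effect = partial_correlation(genetic, resistance, euclidean)
distance_effect = partial_correlation(genetic, euclidean, resistance)
```

In this toy data set the correlation between resistance and straight-line distance is about 0.77, which leaves enough independent variation to attribute the pattern to resistance. As that correlation approaches 1, the denominator approaches zero and the two explanations can no longer be separated, which is the power issue described above.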
Challenge 3: Limited scope of landscape genetics
Even though landscape genetics includes the analysis of adaptive and neutral micro-evolutionary processes (sensu Manel et al. 2003), recent landscape genetic approaches largely focus on describing and mapping populations (e.g., Pritchard et al. 2000; Dupanloup et al. 2002; François et al. 2006) and on identifying factors that influence rates and patterns of gene flow within and between populations (e.g., Coulon et al. 2004; Cushman et al. 2006; Dyer and Nason 2004; McRae and Beier 2007). The rapid expansion of landscape genetic studies has been driven by current needs in conservation biology and natural resource management (Cushman 2006; Storfer et al. 2007). The potential of landscape genetics to address large-scale connectivity questions is particularly important in the face of global climate change coupled with accelerating habitat loss and degradation (Davies et al. 2001; Schlesinger et al. 2001). However, gene flow and population connectivity are relatively narrow topics, and landscape genetics can contribute substantially to other research areas, such as community genetics (Whitham et al. 2006). Landscape genetics also has great potential to expand our understanding of evolutionary processes in spatially complex environments, and could ultimately lead to a theoretical framework for population genetics and evolutionary ecology that appropriately incorporates space and spatial heterogeneity (Holderegger and Wagner 2008).
Ecosystems are the stage on which the play of evolution unfolds. Classic models of population genetics assume implicitly that this stage is uniform, in the form of panmictic populations in a homogeneous environment (Wright 1977). However, ecosystems are not homogeneous, and landscape genetics can contribute to population genetics by elucidating how departure from panmixia in complex landscapes affects population genetic patterns. Furthermore, models in classic evolutionary theory focus on allopatric speciation, in which populations become isolated, gene flow is restricted, and evolution due to drift or local selection creates new species (Futuyma 1997). However, in complex environments the strict allopatric model may not be adequate (Via 2001; Bolnick and Fitzpatrick 2007). Landscape genetics is uniquely suited to explore mechanisms of speciation in a complex resistance landscape, where parts of a population may experience sufficiently reduced gene flow such that drift or selection along locally steep selection gradients could lead to new species (i.e., peripatric speciation; Mayr 1954, 1988). Finally, adaptive landscape genetics explicitly deals with spatial genetic variation under selection, and can be used to study the adaptive and evolutionary potential of populations (Holderegger et al. 2006, 2008). In combination with truly spatial models of selection, landscape genetics offers a powerful framework to model the effects of differential resistance on gene flow and for exploring the influences of spatial patterns and processes in evolution.
Extending landscape genetics beyond evaluations of genetic connectivity will require the exploration of the combined effects of gene flow and selection in complex landscapes. For this, both empirical analysis of selection along environmental gradients and simulation approaches will be critical. Various techniques have been suggested for empirically exploring relationships between selection and environmental gradients to associate patterns in genes under selection with combinations of environmental variables (e.g., Vasemägi and Primmer 2005; Joost et al. 2007; Ouborg and Vriezen 2007; Holderegger and Wagner 2008). These are fertile and exciting approaches, but they also introduce a number of novel challenges that need to be overcome. First, simple matching of a few loci to combinations of environmental variables may produce equivocal results in many cases. For example, a single gene can influence multiple fitness-related traits (pleiotropy), or multiple genes can interact to determine phenotypic expressions (epistasis). Thus, identifying the roles and interactions of multiple genes under selection is a highly complex task, and representing selection gradients spatially in the context of multi-locus traits is a major challenge. We suspect that empirical analysis of patterns of genotypes as functions of selection gradients will be fruitful and critical for evaluation and verification of theoretical expectations.
The main obstacle to integrating spatial processes in classical population genetics is the difficulty of translating the mathematics of ideal, panmictic populations to complex environments. It is likely that closed-form extensions of classic population genetics formulae to spatially complex landscapes are not tractable, due to the difficulty of solving complex differential equations. Instead, spatio-temporal dynamic models could simulate mate selection, genetic exchange, dispersal, and mortality as probabilistic functions of landscape characteristics (e.g., http://landguthresearch.dbs.umt.edu/software.php, http://www2.unil.ch/biomapper/ecogenetics/index.html). For example, the recently developed model CDPOP (http://landguthresearch.dbs.umt.edu/software.php) simulates spatial changes in population genetic data as functions of individual-based movement, breeding and dispersal. The model represents landscape structure as resistance surfaces whose value represents the step-wise cost of crossing each location. Mating and dispersal are modeled as probabilistic functions of cumulative cost across these resistance surfaces. The model is specifically designed to enable explicit quantification of how landscape resistance affects gene flow patterns. Simulations with different resistance grids allow quantification of the effects of different landscape conditions on genetic connectivity and the time required for the spatial patterns of genetic relatedness to equilibrate.
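The idea of dispersal as a probabilistic function of cumulative cost can be sketched in a few lines. The code below is not CDPOP and does not reproduce its algorithms or interface: it assumes a linear decline of dispersal probability with cost, a hard threshold beyond which no dispersal occurs, and hypothetical cost values.

```python
import random


def dispersal_probabilities(costs, threshold):
    """Turn cumulative costs from one origin into destination probabilities.

    Probability declines linearly with cost and is zero at or beyond the
    threshold (both are assumptions of this sketch).
    """
    raw = [max(0.0, 1.0 - cost / threshold) for cost in costs]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("no destination is reachable within the threshold")
    return [value / total for value in raw]


def disperse(costs, threshold, n_offspring, seed=0):
    """Draw a destination index for each offspring of one parent."""
    rng = random.Random(seed)
    probabilities = dispersal_probabilities(costs, threshold)
    return rng.choices(range(len(probabilities)),
                       weights=probabilities, k=n_offspring)


# Hypothetical cumulative costs from one origin cell to four destinations.
costs_from_origin = [0.0, 2.0, 6.0, 51.0]
probabilities = dispersal_probabilities(costs_from_origin, threshold=10.0)
destinations = disperse(costs_from_origin, threshold=10.0, n_offspring=1000)
```

Repeating such draws over many individuals and generations, and tracking the genotypes that move with them, is what allows a simulation to relate a given resistance surface to the resulting pattern of genetic relatedness.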
Models such as CDPOP will enable researchers to evaluate classical population genetic predictions in a landscape context. Specifically, they will allow the formal exploration of how drift, selection, heterozygosity and effective population size change as a function of increasingly complex resistance landscapes. Building such integrated, individual-based gene flow and selection models for complex landscapes represents a substantial challenge, and their full implementation will likely require a super-computing environment. However, developing such simulation models is one of the most important future research needs, as it would enable scientists to evaluate and expand classical population genetics predictions, theorems and equations in a spatially heterogeneous context. It would also provide quantitative expectations of gene flow, neutral and adaptive genetic structures, and evolutionary patterns for a wide range of landscapes. In addition, such models would be beneficial for comparing and improving statistical methods used for correlating environmental and genetic data (see Challenge 2).
Challenge 4: Lack of interdisciplinary communication
Landscape genetics is a highly interdisciplinary research area that combines several complex and rapidly-developing fields. Thus, addressing the identified challenges depends on interdisciplinary collaborations and requires experts from various fields to communicate effectively with each other. Currently, amalgamating expert knowledge from the different fields is hindered by a lack of appropriate communication platforms tailored specifically towards landscape genetics. Researchers involved in landscape genetics are currently not organized in a professional network that could support collaborations among different research groups, and very few scientists have been trained in all disciplines contained in landscape genetics. This makes the exchange of ideas across disciplines particularly challenging.
To ensure continued growth and development of landscape genetics, more training opportunities in landscape genetics and its related disciplines are needed. This training can take many forms ranging from interdisciplinary graduate courses at a single university to collaboratively developed international workshops. Scientific conferences offer unique opportunities for training and for the exchange of ideas. They can host landscape genetic symposia, panel discussions, expert-led courses, and working groups to help establish an international network of landscape genetic research groups. Members of this network should meet on a regular basis, and facilitate collaborations by exchanging ideas, data, and (student) researchers. For example, a workshop targeted towards young landscape geneticists was recently funded through the ConGen program of the European Science Foundation (ESF). Similarly, proposals for a landscape genetic working group and a distributed graduate seminar series have recently been funded by the National Center for Ecological Analysis and Synthesis (NCEAS). Alternative funding could be provided by the COST program (European Cooperation in the Field of Scientific and Technical Research), or the US National Science Foundation (e.g., the Partnerships for International Research and Education program).
This paper is a result of such a landscape genetics meeting, and we intend to continue our collaboration. For example, we plan to develop an internet platform for landscape genetics that will include a landscape genetics “wiki”, web-based courses, announcements of meetings and funding opportunities, and a discussion board. Interdisciplinary, international collaborations are particularly challenging, but they will be vital and rewarding for the future development of landscape genetics.
Landscape genetics should actively seek to address the challenges identified in this paper in order to move from the description of spatial genetic structure to statistical quantifications of the effect of landscape pattern on genetic variation, neutral or adaptive, while integrating both empirical and simulation results into a spatial population genetic theory.
While classical population genetics assumes equilibrium, the landscapes and the genetic processes occurring within them are rarely stable or at equilibrium. Spatial genetic structure results from a combination of evolutionary, behavioral, ecological and stochastic processes operating at different spatial and temporal scales. Thus, landscape genetic studies must address the spatial and temporal scale dependence and transient dynamics of population genetic processes when disentangling the effects of different factors on spatial genetic structure.
To accomplish this, researchers should place greater focus on empirical studies that test sets of competing hypotheses in multiple landscapes and on evaluating current and new analytical methods using simulated data. The simulation modeling needed to develop a spatial theory of population genetics is an ideal playground for intensive collaboration between landscape ecologists, who bring experience in the modeling of spatial processes in heterogeneous landscapes, and population geneticists, who can draw on a large body of non-spatial theory and models. Only intensive collaboration will achieve the level of integration between these disciplines that is necessary to address the tasks outlined in this article. Alternative forms of communication, including internet platforms and international interdisciplinary training, will greatly facilitate this ambitious endeavor.
We thank Brad McRae and two anonymous reviewers for valuable comments that helped to improve this manuscript. We also thank the organizers of the 2007 IALE World Congress.