Introduction: genetic variation of forest trees

Genetic variation within most forest tree species is high in comparison to other organisms. Comprehensive reviews of isozyme variation revealed that forest trees are among the most genetically diverse groups of organisms (Hamrick and Godt 1996). Longevity, efficient gene flow mechanisms, and dominance of outcrossing are life history traits promoting the maintenance of high diversity within forest tree species (Austerlitz et al. 2000). Most studies on genetic variation patterns within tree species were primarily motivated by attempts to improve our understanding of biodiversity at the intraspecific level or the evolutionary dynamics within “natural” plant species or populations in an early stage of domestication. Main applications of marker-based diversity studies have been in molecular tree improvement (e.g., Krutovsky and Neale 2005) and conservation of forest genetic resources (e.g., Finkeldey and Ziehe 2004).

The focus of this review is on the use of the rapidly growing “genomic resources” of forest tree species in the context of forensic applications. Specifically, the knowledge of diversity patterns can be used to clarify the origin of populations, single plants, logs, or even processed wood. Forensic applications imply that optimum material for DNA extraction, for example young leaves, may not be available for investigation (Finkeldey et al. 2007a). Thus, DNA extraction from wood and other “difficult” plant tissue is briefly reviewed as a prerequisite for the development of molecular methods to clarify the origin of plant material. Two main fields of application are discussed in more detail: The use of molecular genetic tools (a) to test the proclaimed origin of forest reproductive material and (b) to infer the origin of timber and wood products.

DNA extraction from wood and other plant tissue

The extraction of genomic DNA from soft, living plant tissue such as leaves or needles of conifers is a standard method in plant genetics (Csaikl et al. 1998; Doyle and Doyle 1987). Numerous commercially available kits are used to extract plant DNA in high quality from leaves; these methods will not be reviewed here. More recently, the analysis of hard, dead plant tissue has attracted the interest of population geneticists and paleogeneticists. Plant fossils and herbarium specimens are in principle suitable material for the extraction of ancient DNA (aDNA), which can be used to identify plant species or even individual genotypes (Gugerli et al. 2005). Genetic information from fossils and herbaria is useful for phylogenetic and phylogeographic analyses (Savolainen et al. 1995). However, aDNA is often highly degraded (Parducci and Petit 2004), and contamination constitutes a main problem for the analysis of aDNA from plants, as also reported for other organisms (Cooper and Wayne 1998). Nevertheless, successful aDNA isolation from ancient wood has been reported in several cases (Liepelt et al. 2006).

Surprisingly, few studies used woody tissue as source material for DNA isolation from living or recently harvested plants. For example, the woody endocarp of the stone fruits of Prunus mahaleb (Godoy and Jordano 2001), the woody pericarp of acorns of oaks (Quercus spp.; Grivet et al. 2005; Ziegenhagen et al. 2003), and the dry wings of the seeds of fir (Abies alba; Ziegenhagen et al. 2003) were used for DNA isolation in order to investigate seed dispersal of trees. A simple and efficient method for DNA isolation from bark tissue based on a modified cetyltrimethylammonium bromide (CTAB) protocol has recently been used for different tree species of woody Leguminosae (Novaes et al. 2009).

The isolation of DNA from wood has been described for oaks (Quercus spp.; Dumolin-Lapègue et al. 1999) and Cyclobalanopsis spp. (Ohyama et al. 2001) of the Fagaceae family.

Several studies proved the usefulness of oak wood as source material for DNA isolation for applications in wood certification and forensics (Deguilloux et al. 2002, 2003, 2004). Important applications of marker-based studies using wood as source material for DNA extraction are evident for tropical forest trees (see below). Methods for extracting DNA from processed and unprocessed wood have been described for the endangered tropical tree species Gonystylus bancanus (Asif and Cannon 2005) and trees of the tropical Dipterocarpaceae family (Rachmayanti et al. 2006). The protocol used for DNA extraction from dipterocarp wood proved to be successful also for numerous trees of the tropical and temperate zone (Rachmayanti et al. 2009).

Most of the published reports on DNA isolation from woody tissue used extraction protocols based on the CTAB method (Doyle and Doyle 1987) or commercially available plant DNA extraction kits. Modifications of protocols used for extraction of DNA from soft, fresh tissues are necessary since wood properties differ from “green” material (Rachmayanti et al. 2009):

  1. Physical. The disruption of woody tissue and the release of DNA require a mechanical treatment such as slicing or drilling. Overheating must be avoided since it may result in irreversible degradation of DNA. On the other hand, DNA can be extracted only from cells whose walls have been disrupted, and incomplete disruption eventually results in low yield.

  2. Chemical. The thick cell walls of wood contain numerous chemical compounds that can potentially inhibit DNA extraction. For example, phenolic compounds of the lignin biosynthesis pathway are often strong inhibitors of DNA extraction.

  3. Biological. Decomposition of wood by microorganisms and fungi starts, at the latest, with the death of the tree. The biological conversion of woody tissue results not only in degradation of the tree’s DNA but also in contamination by “foreign” DNA.

  4. Aging. The process of DNA degradation starts after the death of the cell. The wooden stem of a living tree consists of a complex mixture of both living and dead cells. In principle, the proportion of living cells is higher in the outer rings containing parenchymatic tissue such as wood rays and decreases toward the center.

All of these characters are highly variable among species. In addition, both genetic and environmental variations influence the physical (e.g., wood density) and chemical characters of wood and the resistance against biological degradation. Variation with regard to levels of DNA degradation and inhibitory substances even occurs within single stems. For example, increasing degradation of DNA but decreasing levels of inhibitory substances were observed in stem disks of tropical Dipterocarpaceae from outer sapwood toward inner heartwood (Rachmayanti et al. 2009). Treatment and processing of wood is expected to further impede DNA extraction, but the effects of different wood treatments such as heating, pressing, or application of pesticides on DNA degradation and extraction methods still need to be systematically explored. Accordingly, optimum methods for DNA extraction differ not only among species but also among different samples of treated or untreated wood from stems of a single species or even an individual tree.

The main modifications of extraction protocols for wood in comparison to fresh tissue are careful mechanical disruption of wood, use of protective chemicals, and often considerably increased incubation times. Other relevant aspects are the removal of contaminants from the surface, in particular of old tissue, preferably under sterile conditions (Rachmayanti et al. 2006), and repeated DNA elution. The addition of polyvinyl pyrrolidone to the lysis buffer is often effective (Reynolds and Williams 2004) and allows PCR amplification even in the presence of strong inhibitory substances (Rachmayanti et al. 2009).

Extraction of DNA from wood followed by successful amplification of selected DNA regions has been described for several tree species using modified CTAB protocols (Ohyama et al. 2001; Reynolds and Williams 2004; Rogers and Kaya 2006). The Qiagen DNeasy Plant Mini Kit is the most frequently used commercially available kit for DNA extraction from wood. It has been used with some of the modifications mentioned above to extract DNA from ancient wood up to an age of 1,000 years of different European tree species (Liepelt et al. 2006), both ancient and modern oak wood (Dumolin-Lapègue et al. 1999; Deguilloux et al. 2002, 2003), wood of tropical Dipterocarpaceae (Rachmayanti et al. 2006), and various other tree species (Rachmayanti et al. 2009). N-Phenacylthiazolium bromide extraction, a method frequently used for extraction of ancient DNA, proved to be the most efficient method for extracting DNA from wood of G. bancanus (Asif and Cannon 2005).

The quality and quantity of extracted DNA is inferior to DNA extracted from fresh “green” tissue regardless of the chosen extraction method. DNA extracted from wood is partially degraded and rarely free from inhibitory substances. Successful amplification of DNA fragments by PCR is more important for applications than reliable estimates of DNA concentrations and purity using spectrophotometry. Main determinants of successful amplification after DNA extraction from wood are the investigated species, the age of the wood, its processing status, and even the position of the selected material on a stem disk. In addition, the choice of the DNA region to be amplified is of crucial importance. The presence of DNA in numerous copies in single cells increases success rates. For example, DNA of plastids is expected to be more easily amplified than nuclear single-copy DNA. Even though wood cells do not contain chloroplasts, “chloroplast DNA” is present in numerous copies in wood cells containing other plastids such as amyloplasts (Deguilloux et al. 2002). Numerous studies proved successful amplification of “chloroplast DNA” (cpDNA) from wood (e.g., Deguilloux et al. 2003; Liepelt et al. 2006). Since DNA extracted from wood is at least partially and often highly degraded, the size of DNA fragments is another crucial factor influencing amplification success: The shorter a fragment is, the higher is the chance of successful amplification.
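The relationship between fragment length and amplification success can be illustrated with a simple random-breakage model. The sketch below assumes that strand breaks occur independently at every position with a fixed probability per base, so that a template of length L is intact with probability (1 − break rate)^L. The function name and the break rate are hypothetical and were not estimated in the studies cited above; the model also ignores inhibitors and copy number, so it shows only the qualitative trend.

```python
def intact_probability(fragment_length_bp: int, break_rate: float) -> float:
    """Probability that a template of the given length carries no strand break.

    Assumes breaks occur independently with probability `break_rate` per base
    (an illustrative assumption, not a value measured in the cited studies).
    """
    if fragment_length_bp < 0:
        raise ValueError("fragment_length_bp must be non-negative")
    if not 0.0 <= break_rate <= 1.0:
        raise ValueError("break_rate must lie in [0, 1]")
    return (1.0 - break_rate) ** fragment_length_bp


# Hypothetical break rate, chosen only to illustrate the trend.
HYPOTHETICAL_BREAK_RATE = 0.002
for length_bp in (200, 600, 1100):
    share = intact_probability(length_bp, HYPOTHETICAL_BREAK_RATE)
    print(f"{length_bp:>5} bp: {share:.3f} of templates expected intact")
```

Under this assumption the expected share of intact templates falls geometrically with fragment length, which is consistent with the preference for short target regions when working with degraded DNA.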

Identification of informative markers

The genome size of trees is highly variable. The genome of the first completely sequenced tree species, Populus trichocarpa, has a length of approximately 500 × 10⁶ bp. Thus, it is roughly 1/6 of the length of the human genome, although it contains more genes (Tuskan et al. 2006). The genome of conifers is particularly large; for example, the genome of most pines (Pinus spp.) is larger than 25,000 × 10⁶ bp (Ahuja and Neale 2005). Accordingly, an enormous amount of nuclear sequences is potentially available to select informative markers.

Typical forensic applications of molecular markers require the assignment of a questionable sample to a single genotype or a group of organisms. Accordingly, both the overall amount and the spatial distribution of variation determine the usefulness of particular DNA fragments for forensic applications such as the identification of the origin of material. Since the functional importance of the observed variation is of minor or no concern in this context, noncoding regions are frequently used targets for identification purposes in plants as in animals and humans (Hummel 2003).

Plant cells contain DNA in the nucleus (nDNA), mitochondria (mtDNA), and chloroplasts (cpDNA). Nuclear DNA is often highly variable (see above) and is biparentally inherited. Efficient gene flow, in particular via pollen, is the main factor contributing to high diversity within populations of trees but low differentiation among spatially separated populations (Finkeldey and Hattemer 2007). The proportion of the total diversity which is due to differentiation among tree populations (FST or its analogs) is often low at biparentally inherited markers such as isozymes (Hamrick and Godt 1989) or nuclear microsatellites. Accordingly, variation of a small number of nDNA markers is suitable to establish the identity of genetic information and to assign samples to a particular (known) genotype but only rarely allows reliable identification of the origin of plant material such as wood unless the spatial distribution of genotypes of all possible source plants is known. Encouraging results were reported for population assignment procedures based on multilocus genotypes using Bayesian clustering methods (Pritchard et al. 2000) even if differentiation at most single markers is only moderate. The development of cost-efficient genotyping methods based on high-throughput sequencing (Binladen et al. 2007) is likely to greatly improve the possibilities for assigning samples to heterogeneous, poorly differentiated groups based on multilocus genotypes, at least for intensively studied “model” species.
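The principle of assigning a sample to a source population from its multilocus genotype can be illustrated with a simple frequency-based likelihood calculation. This is a deliberately simpler technique than the Bayesian clustering of Pritchard et al. (2000): it assumes that allele frequencies in each candidate population are already known, that genotypes follow Hardy-Weinberg proportions, and that loci are independent. All population names, allele labels, frequencies, and the `floor` value used for unobserved alleles are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Genotype = List[Tuple[str, str]]      # one diploid (allele, allele) pair per locus
Frequencies = List[Dict[str, float]]  # per locus: allele -> frequency


def log_likelihood(genotype: Genotype, freqs: Frequencies, floor: float = 0.01) -> float:
    """Log-likelihood of a genotype in one population (HWE, independent loci).

    Alleles missing from the reference frequencies receive the small
    frequency `floor` so that a single unseen allele does not yield log(0).
    """
    if len(genotype) != len(freqs):
        raise ValueError("genotype and frequency table must cover the same loci")
    total = 0.0
    for (a, b), locus_freqs in zip(genotype, freqs):
        p = max(locus_freqs.get(a, 0.0), floor)
        q = max(locus_freqs.get(b, 0.0), floor)
        total += math.log(p * p if a == b else 2.0 * p * q)
    return total


def assign(genotype: Genotype, populations: Dict[str, Frequencies]) -> Tuple[str, Dict[str, float]]:
    """Return the most likely source population and all log-likelihoods."""
    scores = {name: log_likelihood(genotype, f) for name, f in populations.items()}
    return max(scores, key=scores.get), scores


# Hypothetical reference data: two regions, three moderately differentiated loci.
populations = {
    "region_A": [{"a1": 0.6, "a2": 0.4}, {"b1": 0.7, "b2": 0.3}, {"c1": 0.5, "c2": 0.5}],
    "region_B": [{"a1": 0.4, "a2": 0.6}, {"b1": 0.3, "b2": 0.7}, {"c1": 0.5, "c2": 0.5}],
}
sample = [("a1", "a1"), ("b1", "b1"), ("c1", "c2")]
best, scores = assign(sample, populations)
print(best, scores)
```

Each locus alone separates the two hypothetical regions only weakly, but the log-likelihoods add up across loci, which is why multilocus genotypes can support assignment even when single-marker differentiation is moderate.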

Uniparental inheritance is the rule for extranuclear DNA in plants. The common mode of inheritance is maternal, i.e., via seed parents only, for mtDNA. Maternal inheritance is also the rule for cpDNA of angiosperms, but gymnosperms including conifers are typically characterized by paternal inheritance of cpDNA (Neale and Sederoff 1989). While genetic variation is more conserved in cp- and mtDNA in comparison to nDNA, differentiation among populations (FST) is often much higher, in particular for maternally inherited markers, since the dispersal of genetic information via seed trees only (maternal inheritance) is much less efficient than dispersal via pollen and seeds (biparental inheritance) in plants. Thus, variation patterns of maternally inherited cp- or mtDNA haplotypes are often suitable for phylogeographic studies and hence useful to distinguish the origin of trees on a large geographic scale (Schaal et al. 1998). Phylogeographic patterns within species depend on their population history. For example, postglacial migration shaped the variation of cpDNA of numerous woody species in Central and Northern Europe (Petit et al. 2003). The most detailed information on range-wide distribution patterns for a common, widely distributed taxon is available for European oaks (Quercus spp.; Petit et al. 2002). Spatial patterns of genetic variation of main plantation species have been considerably modified by humans due to long-distance transfer of forest reproductive material (Gailing et al. 2007; Pandey et al. 2004).
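The contrast between weakly differentiated nuclear markers and strongly differentiated maternally inherited haplotypes can be made concrete with Nei's GST, one of the analogs of FST, computed as (HT − HS) / HT from allele or haplotype frequencies. The sketch below weights all populations equally and applies no sample-size correction; the two frequency sets are hypothetical and serve only to show the two extremes.

```python
from typing import Dict, List


def gene_diversity(freqs: Dict[str, float]) -> float:
    """Expected diversity H = 1 - sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def g_st(populations: List[Dict[str, float]]) -> float:
    """Nei's G_ST = (H_T - H_S) / H_T with equally weighted populations."""
    if not populations:
        raise ValueError("at least one population is required")
    n = len(populations)
    variants = set().union(*populations)
    mean_freqs = {v: sum(p.get(v, 0.0) for p in populations) / n for v in variants}
    h_t = gene_diversity(mean_freqs)                       # total diversity
    h_s = sum(gene_diversity(p) for p in populations) / n  # mean within-population diversity
    return 0.0 if h_t == 0.0 else (h_t - h_s) / h_t


# Hypothetical nuclear locus: similar allele frequencies in both populations.
nuclear = [{"A": 0.55, "B": 0.45}, {"A": 0.45, "B": 0.55}]
# Hypothetical cpDNA marker: each population fixed for a different haplotype.
chloroplast = [{"h1": 1.0}, {"h2": 1.0}]

print(f"nuclear G_ST:     {g_st(nuclear):.3f}")
print(f"chloroplast G_ST: {g_st(chloroplast):.3f}")
```

In the hypothetical nuclear case almost all diversity lies within populations, whereas in the fixed-haplotype case all of it lies among populations, which is the property that makes such markers informative about geographic origin.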

Applications

The rapidly increasing knowledge about genetic variation patterns within and among woody plant species offers interesting opportunities for applications as illustrated by two examples below. The potential for forensic applications is further enhanced by improved DNA extraction methods from wood and other “difficult” plant tissue.

The origin of forest reproductive material

Man-made planted forests play a globally increasing role for the production of wood, other forest products, and environmental services since natural forests are diminishing in most regions (Brown and Ball 2000). The success of plantation establishment critically depends on the use of appropriate reproductive material (seedlings, seeds, or vegetative propagules such as cuttings) adapted to the plantation sites. Recognizing the importance of the origin of forest reproductive material for plantation success, most countries regulate its marketing by legal rules (Nason 2001). Unlike for agricultural crops, well-defined varieties of trees are rarely available. Progenies used for plantation establishment are mostly harvested in natural forests or plantations mainly selected based on phenotypic criteria and raised in nurseries. Limited availability of seeds from selected stands in certain regions and different costs for the harvest and processing of seeds are main reasons for violations of legal rules and the use of unselected material.

Main factors impeding a speedy and reliable identification of the origin of forest reproductive material are the long lifetime of trees, high genetic diversity within most populations, and the lack of well-defined varieties which can be easily identified. Seed harvest, processing and storage of seeds, raising of seedlings in nurseries, and planting often involve long-distance transportation. Proper documentation is essential during all production steps from seed harvest to plantation establishment but does not rule out mislabeling and erroneous declarations concerning the origin of reproductive material. Violations of legal rules concerning forest reproductive material have occasionally been reported in the past, and false declarations of the origin of material appear to be common. Accordingly, methods to control the production of forest reproductive material and specifically to test the reliability of the declared origin of seedlings raised in nurseries are highly desirable.

Molecular genetic markers have been proposed as suitable tools to control the origin of forest reproductive material in Germany (example 1). The objective of the use of markers is to confirm the origin, i.e., the harvesting places, of reproductive material raised in nurseries prior to or even after planting in the forest.

Example 1. Molecular methods to control the origin of forest reproductive material in Germany

Two systems have been implemented in Germany to enhance the traceability of forest reproductive material: The Certification Scheme for Tracing the Origin of Forest Reproductive Material in South Germany (ZüF; http://www.zuef-forstpflanzen.de/) and the Association for Forest Reproductive Material (FfV e.V. (ISOGEN) method; http://www.isogen.de/). Molecular markers play a central role in improving traceability in both methods. Samples are taken throughout the production process of forest reproductive material from harvesting and seed processing to the raising of seedlings in nurseries. This allows samples taken at different stages of the production process to be investigated and compared. Samples are centrally stored and investigated at different molecular markers. To reduce costs and increase efficiency, only a subset of all samples is randomly selected and analyzed by molecular methods.

The choice of the most appropriate molecular method to trace the origin largely depends on the species and the current knowledge about spatial distribution patterns of genetic diversity. Since fresh, living seed or plant material is available, not only DNA markers but also biochemical methods may be used for investigation. For example, isozymes have been used to check the origin of beech (Fagus sylvatica) reproductive material (Konnert and Hussendörfer 2002). The detailed knowledge about phylogeographic variation patterns of cpDNA haplotypes in European oaks (Quercus spp.; Petit et al. 2002) makes cpDNA markers excellent tools to infer the origin of oaks (Gailing et al. 2003; Gailing et al. 2007) and to identify misclassified oak seedlings. Examples of the use of cpDNA variation in oaks to support or to reject the identity of small population samples are shown in Fig. 1. The observation of identical haplotypes in comparable proportions in harvested seeds before and after transport and seed storage strongly suggests the maintenance of identity (Fig. 1a). The observation of cpDNA haplotypes (haplotypes 4 and 5) in seeds after transport and storage which were not observed in seeds sampled directly in the forest during the time of harvest indicates a false declaration of the origin of the seeds (Fig. 1b). The example illustrates the potential of strongly differentiating markers to identify false declarations even with small sample sizes.

Fig. 1

CpDNA haplotypes (nomenclature according to Petit et al. (2002); n number of observations) in acorns of oaks (Quercus spp.) claimed to originate from the same harvesting operation in two different populations (a P1, b P2) before and after transport and storage of seeds (Leinemann, unpublished)
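The comparison of haplotype compositions before and after transport and storage can be sketched as follows. The code lists haplotypes that occur in the later sample but not in the reference sample taken at harvest, and it also computes how likely it is that a haplotype present in the stand was simply missed in the reference sample. The function names and all haplotype lists are hypothetical and do not reproduce the data in Fig. 1; the calculation further assumes that sampled seeds are independent draws, which is optimistic when many seeds share the same seed parent.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def haplotype_frequencies(sample: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each haplotype in a sample."""
    counts = Counter(sample)
    n = sum(counts.values())
    return {haplotype: count / n for haplotype, count in counts.items()}


def unexpected_haplotypes(reference: Iterable[str], later: Iterable[str]) -> Set[str]:
    """Haplotypes found in the later sample but absent from the reference sample."""
    return set(later) - set(reference)


def prob_missed(true_frequency: float, reference_size: int) -> float:
    """Chance that a haplotype with the given frequency in the stand is absent
    from a reference sample of independent seeds."""
    return (1.0 - true_frequency) ** reference_size


# Hypothetical haplotype calls for seeds sampled at harvest and after storage.
at_harvest = ["1", "1", "1", "7", "7", "1", "7", "1", "1", "7"]
after_storage = ["1", "4", "5", "1", "7", "4", "1", "5", "7", "1"]

print(haplotype_frequencies(at_harvest))
print(sorted(unexpected_haplotypes(at_harvest, after_storage)))
print(prob_missed(0.2, len(at_harvest)))
```

A haplotype that is frequent after storage but has a very small chance of having been missed at harvest is strong evidence against identity, whereas a rare additional haplotype may only reflect the limited size of the reference sample.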

The origin of wood and wood products

Illegal logging is one of the main factors contributing to forest destruction (Brack 2003). A large share of internationally traded timber and wood products originates from illegal logging. Traceability of wood and wood products throughout the chain-of-custody from producers to consumers will greatly enhance the use of legally produced timber and contribute to sustainable forest management. Forest certification schemes, state agencies such as customs offices, forest enterprises producing timber according to the principles of sustainability, and environmentally conscious consumers profit from reliable methods improving the traceability of timber and offering opportunities to identify false declarations of the origin of timber. Molecular genetic markers have been suggested as potentially suitable tools to identify false declarations due to the great stability of DNA and the impossibility of manipulating the DNA contained in dead plant tissue (Finkeldey et al. 2007b).

Two basic requirements need to be fulfilled to use DNA markers for tracing the origin of wood or wood products: DNA needs to be isolated from woody tissue, and informative markers need to be identified and investigated (see above). A first example on the application of molecular markers to trace the geographic origin of wood was provided within the context of the cooperage industry in France: Chloroplast DNA isolated from oak (Quercus spp.) wood has been used to identify false declarations on the origin of wood used for the production of barrels for wineries in France (Deguilloux et al. 2004).

The development of reliable and efficient tracing methods is most urgent for forests and forest tree species particularly affected by illegal logging activities. Tropical forests are threatened since their area is globally declining and since they contain numerous endangered species. Molecular markers are used to identify logs of the endangered tropical tree genus Intsia (Merbau). Hypervariable microsatellite markers (Wong et al. 2009) are used to genetically fingerprint the rootstock of harvested trees and to establish the genetic identity between rootstocks and harvested logs. Encouraging results have also been reported for the important tropical tree family Dipterocarpaceae (example 2).
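Establishing genetic identity between a rootstock and a log amounts to comparing their multilocus microsatellite genotypes and asking how often the same genotype would occur by chance. The sketch below assumes diploid genotypes coded as allele sizes in bp, Hardy-Weinberg proportions, and independent loci; all genotypes and allele frequencies are hypothetical and are not taken from Wong et al. (2009). Genotyping errors such as allelic dropout, which are likely with degraded DNA from wood, are not modeled.

```python
from typing import Dict, List, Tuple

Genotype = List[Tuple[int, int]]  # allele sizes (bp) at each locus


def normalise(genotype: Genotype) -> List[Tuple[int, int]]:
    """Sort the two alleles at each locus so that allele order does not matter."""
    return [tuple(sorted(pair)) for pair in genotype]


def genotypes_match(a: Genotype, b: Genotype) -> bool:
    """True if both samples carry the same alleles at every locus."""
    return normalise(a) == normalise(b)


def random_match_probability(genotype: Genotype, freqs: List[Dict[int, float]]) -> float:
    """Expected frequency of the genotype in the population (HWE, independent loci).

    Raises KeyError if an allele is missing from the frequency table.
    """
    if len(genotype) != len(freqs):
        raise ValueError("genotype and frequency table must cover the same loci")
    probability = 1.0
    for (x, y), locus_freqs in zip(genotype, freqs):
        p, q = locus_freqs[x], locus_freqs[y]
        probability *= p * p if x == y else 2.0 * p * q
    return probability


# Hypothetical genotypes at two loci and hypothetical allele frequencies.
rootstock = [(150, 154), (200, 200)]
log_sample = [(154, 150), (200, 200)]
allele_freqs = [{150: 0.1, 154: 0.2}, {200: 0.5}]

print(genotypes_match(rootstock, log_sample))
print(random_match_probability(rootstock, allele_freqs))
```

With hypervariable markers the individual allele frequencies are low, so the product over several loci quickly becomes very small, and a match between rootstock and log is unlikely to arise by chance.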

Example 2. Identification of the species and the origin of wood of Dipterocarpaceae

Dipterocarps (family Dipterocarpaceae) are a very species-rich group of tropical trees with a pantropical distribution and a pronounced center of diversity in Southeast Asia. The family dominates many lowland forests in Southeast Asia and is an important source of tropical timber. Dipterocarps have been rarely planted in the past. Examples of trade names are meranti and balau for Shorea spp., keruing for Dipterocarpus spp., and kapur for Dryobalanops spp.

Simple but efficient methods to extract DNA from wood of dipterocarps were developed (Rachmayanti et al. 2006). Amplification of fragments was achieved after DNA extraction from wood with an average success rate of more than 75%. Successful amplification of fragments below 200 bp was achieved in more than 90% of cases after extraction of DNA from 406 samples of processed and unprocessed wood of numerous species. Success rates dropped below 80% for a cpDNA fragment of approximately 600 bp (trnL) and below 58% for another, longer cpDNA fragment (trnLF; approximately 1,100 bp). Another important factor for amplification success is the processing status of wood. Amplification success is much lower if processed wood is used for DNA extraction than if untreated wood is used.

Variation of cpDNA is suitable to identify dipterocarp species (Indrioko et al. 2006). Since many dipterocarp species are endemic, the identification of these species restricts the potential places of origin of timber to their small natural distribution area. Low within-species diversity of cpDNA of most dipterocarps and incomplete differentiation impede the development of informative markers to distinguish different regions of origin for common, widespread species. Strong differentiation was found among populations of the most common dipterocarps in Indonesia, Shorea leprosula and Shorea parvifolia, at a small number of anonymous AFLP fragments (Cao et al. 2006). Some of these fragments were successfully converted to sequence-characterized amplified region (SCAR) markers showing strong differentiation among regions. For example, a SCAR marker was developed to unambiguously distinguish between samples of S. leprosula from Borneo and from Sumatra (Fig. 2; Nuroniah 2009).

Fig. 2

Complete differentiation between populations of S. parvifolia from the islands of Sumatra and Borneo (n = 159 plants from 14 populations) at a SCAR marker (modified from Nuroniah 2009). Distinction of types ‘0’ and ‘1’ visualized by the PCR-RFLP method (2a; L, 100-bp ladder; P, positive control; N, negative control)

Outlook

Improved DNA extraction protocols for “difficult” plant material such as wood and a better understanding of spatial patterns of genetic variation for numerous tree species frequently make it possible to infer the origin of living plants or dead tissue by investigating specific DNA regions. These recent developments offer numerous potential applications in forensics; the most important field of application is the identification of timber species and the origin of wood and wood products based on DNA markers. It is already possible to reliably detect false declarations concerning the origin of illegally harvested timber using molecular genetic tools, albeit only for a small fraction of relevant species and problems.

Further studies are needed to modify existing DNA extraction protocols for processed wood. However, mechanical disruption of wood, heating, pressure, application of glues or other chemicals, and other treatments may result in almost complete degradation of DNA limiting the applicability of DNA-based methods to identify the origin of wood products.

The development of informative markers is time-consuming and costly. Species identification has important applications in controlling the trade in protected plants and plant products and in identifying the origin of wood of endemic species. Progress in plant barcoding (Chase and Fay 2009) will greatly enhance the use of DNA variation for species identification. Considerable efforts are needed to develop information sources containing relevant phylogeographic data to distinguish different regions of origin of trees, at least for the most important species. Cost- and time-saving high-throughput sequencing technologies (Binladen et al. 2007) will enhance the rapid development of these databases. However, species-specific limitations of the maximum possible spatial resolution exist due to the exchange of genetic information within and among populations in the past and present.