Background

Alu elements are a family of primate-specific SINEs (Short INterspersed Elements) approximately 300 base pairs (bp) long that are present in the genomes of all living primates [1–3]. Alu elements were derived from 7SL RNA, the RNA component of the signal recognition particle, in the common ancestor of all living primates [4]. In the past approximately 65 million years Alu elements have become widely distributed in primate genomes [1, 5]. Alu elements are now present at copy numbers of >1,000,000 in all surveyed great ape genomes (Additional file 1) [1]. Despite their high copy number, the majority of Alu elements are genomic fossils, non-propagating relics passed down over millions of years after earlier periods of replicative activity [1, 6]. It is hypothesized that a relatively small number of ‘master’ elements are responsible for the continued spread of all active subfamilies [7, 8].

As non-autonomous retrotransposons, Alu elements do not encode the enzymatic machinery necessary for self-propagation [1, 2]. Propagation is instead accomplished by appropriating the replication machinery [2, 9] of a much larger, autonomous retrotransposon called LINE1 (L1) via a process termed target-primed reverse transcription (TPRT) [10–13].

The effective use of SINEs as phylogenetic markers was first demonstrated in 1993 in a study seeking to resolve relationships between Pacific salmonid species [14]. Since that study, SINE-based phylogenetic methods have been applied across a wide range of species to determine evolutionary relationships [15, 16]. In particular, Alu elements have proven to be extremely useful tools for elucidating evolutionary relationships between primate species [1, 17]. The essentially homoplasy-free presence of an Alu element of the same subfamily at a given locus in two or more primate species is almost always definitive evidence of shared ancestry [18]. The possibility of confounding events is very small, and such events are easily resolved by sequencing and examining the element in question [1, 18]. In the past 15 years Alu-based phylogenetic methods have been used with great success to resolve evolutionary relationships among the tarsiers [19, 20], New World [21] and Old World monkeys [22–24], gibbons [25], lemurs [26, 27], and great apes [28].

In addition to phylogenetic applications, Alu elements also function as effective markers for the study of population genetics via examination of polymorphic elements between members of the same species [2, 29, 30]. Alu elements are also linked to numerous genetic diseases, and the insertion of an element at an inopportune genomic location can have grave consequences for the individual involved [3, 31, 32]. Additionally, Alu elements are thought to be a causal factor in genomic instability [33–36].

Alu elements are classified in multiple major subfamilies and numerous smaller, derivative subfamilies based on specific sequence mutations [37–40]. All extant primates share older elements, while all primate lineages examined also have younger, lineage-specific subfamilies [41]. Alu subfamily evolution is parallel, not linear, and various subfamilies have been found to be actively retrotransposing at the same time in all primate genomes surveyed; each primate lineage thus possesses its own Alu subfamilies [1, 42, 43].

The Alu J subfamily is the most ancient Alu lineage, and was largely active from approximately 65 million years ago to approximately 55 million years ago, at which point Alu S evolved and supplanted Alu J as the predominant active subfamily [37, 41]. Due to the antiquity of the lineage, Alu J subfamilies are present in all extant primates, including Strepsirrhines [27, 44]. Alu S, on the other hand, evolved from Alu J after the Strepsirrhine-Haplorrhine divergence, and so is only found in New World and Old World primates [2, 37, 45]. The Alu Y subfamily subsequently evolved from Alu S in the Old World primate lineage, and remains the predominant active subfamily in catarrhines [1, 41, 45].

A number of Alu Y-derived subfamilies continue to be active in great apes [1], and polymorphic lineage-specific Alu elements have been well documented between existing human populations [2], indicating continued activity of these mobile elements. A rate of one new element in approximately every 20 live births has been proposed as the current rate of Alu element activity in the extant human population, but the large size of this population coupled with human generation time would make it very difficult for new elements to come to fixation outside of small population groups [46, 47]. Research into Alu element activity in Sumatran and Bornean orangutans has indicated a comparatively low level of continued retrotransposition activity in these apes [48], suggesting some alteration of the propagation of Alu within this lineage [49].

The western lowland gorilla (Gorilla gorilla gorilla), a subspecies of the western gorilla (Gorilla gorilla), is a critically endangered great ape endemic to the forests and lowland swamps of central Africa [50, 51]. Western lowland gorillas are gregarious, living in family groups comprised of a dominant male, multiple females, subadult males, and juvenile offspring [52]. Western lowland gorillas are in danger of extinction due to human activity. Their wild population size is shrinking in the face of anthropogenic pressure and diseases such as Ebola [50]. Gorillas are a close evolutionary relative of humans and the Pan lineage of chimpanzees and bonobos, with the most widely accepted date for a common ancestor 6 to 9 million years ago [28, 53–55], though a date as early as 10 million years ago has been recently proposed [56].

The genome of ‘Kamilah’, a female western lowland gorilla living at the San Diego Zoo, was initially assembled from 5.4 Gbp of capillary sequence and 166.8 Gbp of Illumina read pairs, and further refined using bacterial artificial chromosome (BAC) and fosmid end pair capillary technology [57]. This sequence is available from the Wellcome Trust-Sanger Institute.

Previous analyses of Alu elements in gorillas have been carried out only in the context of wider research projects [28, 58–61] and have not focused specifically on subfamily analysis. Here we examine the western lowland gorilla genome (build gorGor3.1) [57] to identify gorilla-specific Alu Y subfamilies and assess the activity levels, copy number, and age of these subfamilies. Our final analysis resulted in the identification of 1,075 gorilla-specific Alu element insertions.

Results and discussion

Computational examination of the western lowland gorilla genome

A total of 1,085,174 Alu elements were identified in the genome of the western lowland gorilla (Additional file 1). Of these, 286,801 were identified as belonging to the ancient Alu J subfamily, and 599,237 were identified as members of the Alu S subfamily. A total of 57,427 elements were too degraded or incompletely sequenced to be assigned a subfamily designation by RepeatMasker, and were simply identified as ‘Alu’. We identified 141,709 members of the Alu Y subfamily. This subfamily is of particular interest due to its relatively young age and known continued mobility in other great ape genomes [1, 62]. Approximately 40% (57,458) of these putative Alu Y elements were >250 bp in length. Gorilla-specific elements were subsequently identified by comparison of orthologous loci in the genomes of human, common chimpanzee, and orangutan [63]. Putative unique, gorilla-specific Alu Y insertions were estimated at 4,127 copies. This number is similar (96.5%) to the 4,274 gorilla-specific Alu elements identified using other approaches [58]. These loci were manually examined for gorilla specificity using BLAT [64], which demonstrated that the majority of our 4,127 loci were in fact shared insertions. This manual examination excluded 2,858 loci from further analysis due to the presence of shared insertions missed by Lift Over (2,626 insertions) or the lack of orthologous flanking regions in the genomes of other species, which precludes PCR verification (232 insertions). This resulted in a total of 1,269 likely gorilla-specific Alu insertion loci for inclusion in subfamily structure analysis.
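The locus counts reported for each filtering stage can be cross-checked with a short tally. The sketch below is only a bookkeeping aid, not the pipeline used in the study: every number is taken from the text (the 84,251 short elements from the Methods, the 194 loci removed by COSEG from the subfamily analysis), while the step labels and names such as `filtering_funnel` are illustrative assumptions.

```python
# Counts are taken from the text; labels and names are illustrative only.
ALU_COUNTS = {
    "Alu J": 286_801,
    "Alu S": 599_237,
    "Alu Y": 141_709,
    "Alu (unassigned)": 57_427,
}


def filtering_funnel():
    """Return (label, loci remaining) after each reported filtering step."""
    steps = []
    remaining = ALU_COUNTS["Alu Y"]
    steps.append(("Alu Y identified by RepeatMasker", remaining))

    remaining -= 84_251  # elements under the 250 bp length cutoff (Methods)
    steps.append(("longer than 250 bp", remaining))

    remaining = 4_127  # putative gorilla-specific loci after Lift Over
    steps.append(("putative gorilla-specific", remaining))

    remaining -= 2_626  # shared insertions found by manual BLAT examination
    remaining -= 232    # loci without orthologous flanking sequence
    steps.append(("after manual examination", remaining))

    remaining -= 194  # truncated or deleted diagnostic regions (COSEG)
    steps.append(("after COSEG", remaining))
    return steps


if __name__ == "__main__":
    print(f"Total Alu elements: {sum(ALU_COUNTS.values()):,}")
    for label, count in filtering_funnel():
        print(f"{label:<36}{count:>10,}")
```

Run as a script, this prints one line per stage, which makes it easy to see where most candidate loci were discarded.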

These 1,269 loci were analyzed for subfamily structure using the COSEG program. COSEG removed 194 probable gorilla-specific Alu insertions from the dataset due to the presence of truncations or deletions in diagnostic regions of the element, leaving 1,075 probable gorilla-specific Alu insertion loci for further analysis (Additional file 2). COSEG then divided the loci into 10 subfamilies based on diagnostic mutations in the sequence of the individual Alu elements and provided subfamily consensus sequences (Figure 1) [43]. The consensus sequences were then aligned with known human Alu Y subfamilies from the RepBase database of repetitive elements [65] (Figure 2). A gorilla-specific nomenclature system was created to designate subfamilies using the suffix ‘Gorilla’ preceded by the subfamily affiliation based on a comparison to identified human subfamilies (for example, ‘Alu Yc5a1_Gorilla’). Subfamilies were named in accordance with established practice for Alu subfamily nomenclature [41]. The first identified Alu Yc5-derived subfamily was, for example, designated Alu Yc5a3_Gorilla. The ‘a’ denotes the fact that this is the first Yc5-derived subfamily identified. The ‘3’ denotes the number of diagnostic mutations by which this gorilla-specific subfamily differs from the human Alu Yc5 consensus sequence [41]. Subfamily age estimates were calculated using the BEAST (Bayesian Evolutionary Analysis by Sampling Trees) program [66].
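The naming rule described above (parent subfamily, a letter for the order of discovery, the number of diagnostic mutations, and the ‘_Gorilla’ suffix) can be written as a small helper. This is a minimal sketch of the convention as stated in the text; the function name `gorilla_subfamily_name` and its arguments are hypothetical and do not come from any published tool.

```python
import string


def gorilla_subfamily_name(parent: str, order: int, n_diagnostic: int) -> str:
    """Build a gorilla-specific subfamily name.

    parent       -- subfamily the new lineage derives from, e.g. "Yc5"
    order        -- 1 for the first derived subfamily identified ("a"), 2 for "b", ...
    n_diagnostic -- number of diagnostic mutations separating it from the parent consensus
    """
    if not 1 <= order <= 26:
        raise ValueError("order must be between 1 and 26")
    if n_diagnostic < 1:
        raise ValueError("a derived subfamily needs at least one diagnostic mutation")
    letter = string.ascii_lowercase[order - 1]
    return f"Alu {parent}{letter}{n_diagnostic}_Gorilla"


if __name__ == "__main__":
    print(gorilla_subfamily_name("Yc5", 1, 3))  # first Yc5-derived subfamily, 3 mutations
    print(gorilla_subfamily_name("Yc5", 2, 2))  # second Yc5-derived subfamily, 2 mutations
```

Subfamilies whose consensus matches an existing human consensus exactly (such as Alu Y_Gorilla) carry no letter or number, so they fall outside this helper.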

Figure 1

Analysis of gorilla-specific Alu subfamilies. (A) A schematic diagram of a tree of evolutionary relationships of the four genera in Family Hominidae (great apes) based on divergence dates of 6 to 9 million years ago for the Gorilla-Homo/Pan speciation event [28, 53–55]. (B) A pie chart showing a color-coded distribution of Gorilla-specific Alu Y subfamilies. Alu Y_Gorilla is the largest subfamily, representing slightly less than three-fourths of the total copy number identified. (C) A stepwise analysis of the relationships between Gorilla-specific Alu Y subfamilies generated from a Network analysis of the consensus sequences for each subfamily. The colors of the dots representing each subfamily correspond to the colors in the pie chart in Figure 1B.

Figure 2

Alu sequence alignment. The consensus sequence for the Alu Y subfamily is shown at the top, with western lowland gorilla-specific Alu subfamilies listed below. Dots below the consensus denote identical bases, dashes denote insertions and deletions, and mutations are shown as the substituted bases. The consensus sequences for the Alu Ya1, Alu Yc1, and Alu Yc5 subfamilies are included for comparative purposes. Subfamily-specific diagnostic mutations are highlighted in yellow. Lineage-specific deletions are highlighted in red. Alu Y_Gorilla is 100% identical to the Alu Y consensus sequence. The shared 12-bp deletion identifying the Alu Yc5-derived Gorilla subfamilies is located at position 86. The 16-bp and 20-bp deletions identifying the Alu Y16_Gorilla and Alu Y16a4_Gorilla subfamilies are visible at positions 228 and 232.

Alu Y subfamily activity in the western lowland gorilla genome

Computational and PCR analysis of the western lowland gorilla genome has identified 1,075 independent, gorilla-specific Alu Y insertion loci. Computational analysis of this dataset indicates the presence of 10 distinct subfamilies identifiable by the presence of diagnostic mutations specific to each lineage. The 1,075 elements identified in our study almost certainly do not represent the total number of Alu Y elements specific to the western lowland gorilla genome. Any loci shorter than our arbitrary length cutoff of 250 bp were excluded from our dataset. It is also likely that a number of Alu Y loci are located in portions of the genome where sequence data are incomplete, such as repeat regions. Additionally, some Alu Y loci were excluded when no orthologous genomic region was present in the species being used for comparison.

The largest newly identified gorilla-specific Alu subfamily was designated as Alu Y_Gorilla. This designation was established via computational evaluation and manual alignment of the 759 elements assigned to this subfamily. The consensus sequence for these elements was found to be 100% identical to the canonical Alu Y human consensus sequence (Figure 2). This subset of classic Alu Y elements continued to propagate in the Gorilla lineage after the divergence from the shared common ancestor with the Homo-Pan lineage. We assayed and verified a total of 135 loci from this subfamily via PCR (18%). The 43 elements belonging to the Alu Ya1_Gorilla subfamily differ from the Alu Y consensus sequence by one diagnostic mutation at nucleotide position 133. We assayed and verified via PCR 21 elements in this subfamily (49%). This sequence should not be confused with the Homo-Pan Alu Ya subfamily.

The Alu Ya1b4 subfamily is derived from Alu Ya1_Gorilla and is a small and very likely young subfamily of 13 elements that shares the diagnostic mutation at position 133 of Ya1 but has also accrued four additional diagnostic mutations. We assayed and verified via PCR seven elements in this subfamily (54%). A second identified Alu Y lineage in gorilla is the Alu Yc3_Gorilla subfamily. We assayed and verified via PCR 20 of the 69 elements in this subfamily (29%). The consensus sequence for the 69 members identified in this subfamily is a 100% match to the human Alu Yc3 subfamily consensus sequence (Figure 2).

Two additional gorilla-specific Alu Yc-derived subfamilies share the characteristic 12 bp deletion at positions 87–98 that is a hallmark of human Alu Yc5. These two subfamilies possess independent diagnostic mutations that make them distinct from the Alu Yc5 consensus sequence, and are designated Alu Yc5a3_Gorilla (55 elements identified) and Alu Yc5b2_Gorilla (46 elements identified). Alu Yc5a3_Gorilla has three additional diagnostic mutations differentiating it from the Alu Yc5 consensus. In keeping with Alu subfamily naming convention this subfamily has thus been designated ‘Yc5a3’, with ‘a’ marking it as the first Yc5-like subfamily identified in the gorilla genome and ‘3’ denoting the three diagnostic mutations differentiating it from the canonical Yc5 consensus. We assayed and verified 27 members of this subfamily via PCR (49%). Alu Yc5b2_Gorilla also shares the characteristic 12 bp deletion of the human Alu Yc5, but has two independent diagnostic mutations (Figure 2). We assayed and verified via PCR 19 members of this subfamily (41%). It is probable that Alu Yc5a3_Gorilla and Alu Yc5b2_Gorilla derived from Alu Yc5 around the time of the Homo/Pan-Gorilla speciation event.

A third lineage, nearly identical to human Alu Yb3a2 but containing two additional diagnostic mutations, was designated Alu Yb3a2b2_Gorilla (25 elements identified). This lineage evolved independently in the Gorilla gorilla gorilla genome and is not a derivative of the human-specific Alu Yb3a2. The Alu Yb lineage is human specific, meaning any identical or apparently derived Alu lineages in other primate genomes must be examples of independent evolution [67]. This is confirmed by the lack of orthologs at the same location in the human genome. We assayed and verified 14 members of this subfamily via PCR (56%). An additional subfamily present at only 17 copies and derived from Alu Yb3a2b2_Gorilla was identified and termed Alu Yb3a2b2a2_Gorilla, due to two diagnostic mutations separating these otherwise identical subfamilies. We assayed and verified via PCR nine elements in this subfamily (53%). The low copy number of these subfamilies coupled with their lack of impairing point mutations, even with the caveat that some subfamily members may have been overlooked, leads us to posit that they are among the youngest and potentially still active subfamilies in the western lowland gorilla genome.

Two additional subfamilies were identified that, while clearly Alu Y derived, do not follow the consensus sequences of established subfamilies available via RepBase. The first, termed Alu Y16_Gorilla, is clearly identified by the presence of an A-rich insert at position 219 followed by a 16 bp deletion, and is present in 30 copies. We assayed and verified via PCR 10 members of this subfamily (33%). The second subfamily, apparently derived from the first and designated Alu Y16a4_Gorilla, is present in 18 copies and can be distinguished from Alu Y16_Gorilla by a 20 bp deletion occurring after the A-rich region at position 219. Seventeen elements from this subfamily were assayed via PCR (94%), and all 17 were verified as gorilla-specific. One locus (gorGor3.1 chrX:74544052–74544324) lacked sufficient orthologous 5′ sequence in non-gorilla outgroups to successfully design a working primer, but was included in the total count based on computational verification. The accumulation of non-diagnostic mutations in these two subfamilies may indicate that they are more ancient.

Approximately 25% of the 1,075 gorilla-specific Alu Y elements computationally identified in this study were verified by PCR, with the remaining approximately 75% verified by manual examination of computational data. It is important to note that we had no false positives in this study, and 100% of the elements computationally identified as gorilla-specific that were subsequently assayed via PCR were confirmed to be, in fact, gorilla-specific.
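The per-subfamily copy numbers and PCR counts quoted in this section can be tabulated to confirm that they add up to the reported totals. The sketch below only restates numbers from the text; the dictionary layout and the name `pcr_summary` are illustrative assumptions.

```python
# (copy number, loci assayed and verified by PCR), as reported in the text.
SUBFAMILIES = {
    "Alu Y_Gorilla": (759, 135),
    "Alu Ya1_Gorilla": (43, 21),
    "Alu Ya1b4_Gorilla": (13, 7),
    "Alu Yc3_Gorilla": (69, 20),
    "Alu Yc5a3_Gorilla": (55, 27),
    "Alu Yc5b2_Gorilla": (46, 19),
    "Alu Yb3a2b2_Gorilla": (25, 14),
    "Alu Yb3a2b2a2_Gorilla": (17, 9),
    "Alu Y16_Gorilla": (30, 10),
    "Alu Y16a4_Gorilla": (18, 17),
}


def pcr_summary(table):
    """Return per-subfamily PCR percentages plus total copies and total assayed loci."""
    percentages = {
        name: 100.0 * assayed / copies for name, (copies, assayed) in table.items()
    }
    total_copies = sum(copies for copies, _ in table.values())
    total_assayed = sum(assayed for _, assayed in table.values())
    return percentages, total_copies, total_assayed


if __name__ == "__main__":
    percentages, total_copies, total_assayed = pcr_summary(SUBFAMILIES)
    for name, pct in percentages.items():
        print(f"{name:<24}{pct:5.0f}%")
    overall = 100.0 * total_assayed / total_copies
    print(f"Overall: {total_assayed} of {total_copies} loci ({overall:.1f}%)")
```

Every subfamily exceeds the 10% minimum sampling fraction stated in the Methods, with Alu Y_Gorilla the least densely sampled.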

One means of identifying potential master elements [7] is to look for subfamily members with mutation-free polyA-tails [68]. A possible source element for the Alu Y_Gorilla subfamily, for instance, was identified on chrX:5135584–5135921, with a mutation-free 30 bp polyA-tail and intact promoter region. A posited source element for the Alu Yc5b2 subfamily was identified on chr9:17925753–17926051, also with a mutation-free 30 bp polyA-tail and intact promoter region.
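A simple way to screen for the first of these criteria is to measure the uninterrupted run of adenines at the 3′ end of each element. The sketch below assumes each sequence is supplied in the sense orientation of the Alu element; the function names and the 30 bp default are illustrative choices based on the examples above, and the promoter region is not checked.

```python
def terminal_poly_a_length(seq: str) -> int:
    """Length of the uninterrupted run of A's at the 3' end of a sequence."""
    run = 0
    for base in reversed(seq.upper()):
        if base != "A":
            break
        run += 1
    return run


def has_clean_poly_a(seq: str, min_tail: int = 30) -> bool:
    """True if the 3' end carries at least `min_tail` consecutive A's."""
    return terminal_poly_a_length(seq) >= min_tail


if __name__ == "__main__":
    # Toy sequences for illustration only, not real gorilla loci.
    intact = "GGCCGGGCGCGGTGGCTCA" + "A" * 30
    interrupted = "GGCCGGGCGCGGTGGCTCA" + "A" * 10 + "G" + "A" * 20
    print(terminal_poly_a_length(intact), has_clean_poly_a(intact))
    print(terminal_poly_a_length(interrupted), has_clean_poly_a(interrupted))
```

A single interrupting base resets the count, so a tail with one substitution is scored by its terminal run alone.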

Alu Y retrotransposition rates appear to be lower in the western lowland gorilla genome than in the human or chimpanzee genomes [69], while higher than that seen in the orangutan genome [48, 49]. Factors influencing rates of retrotransposition are myriad [1, 46]. Active retrotransposons are frequently polymorphic within a population, and are easily lost during events like speciation or population bottlenecks [70, 71]. The number of active elements, and the amplification rate of elements surviving such an event, can vary greatly and impact overall retrotransposition activity in the host genome.

Possible explanations for this lower activity level include inhibition of retrotransposition in the Gorilla lineage by the interaction of host factors, such as members of the APOBEC family of proteins, with the enzymatic machinery of L1 [1, 72]. Interference with L1 and Alu retrotransposition by APOBEC has been documented [72–74]. Analysis of the activity level of Gorilla-specific L1 elements could elucidate this, but has not yet been done. Additionally, environmental stress factors may impact retrotransposition rates [75]. It is possible that one or a combination of these retrotransposition-inhibiting factors could be responsible for the lower level of Alu Y activity in the western lowland gorilla genome.

A median-joining tree of relationships between gorilla-specific Alu Y subfamilies was generated from a stepwise alignment [76] using the Network program (Figure 1) [42, 77]. The tree generated supports the divergence of all gorilla-specific subfamilies from the Alu Y_Gorilla subfamily, and analysis of subfamily ages using BEAST places the date for this subfamily divergence at the stem of the Gorilla lineage. Initial divergence of gorilla-specific subfamilies occurred shortly after the speciation event separating the Gorilla lineage from the Homo-Pan lineage 6 to 9 million years ago [28, 53–55], and master elements have continued to produce copies of each subfamily at varying rates since [7].

Divergence dates of gorilla-specific Alu Y subfamilies

BEAST analysis of individual subfamily ages suggests no delay or change in transposon activity in western lowland gorilla following the divergence of the Gorilla and Homo-Pan lineages. The ages of the gorilla-specific lineages range from 6.5 to 6.71 million years ago based on a baseline divergence of 7 million years ago for the most recent common ancestor of Gorilla and Homo-Pan. This indicates that all of the identified subfamilies originated around the time of the speciation event that separated these two lineages. This result is consistent with the ongoing propagation of these subfamilies before, during, and after the speciation event at a relatively constant rate. It further indicates that the ‘master genes’ [7] from which these subfamilies are derived already existed and were retrotranspositionally active prior to the aforementioned speciation event, and have remained active subsequently. Examination of Alu elements indicates retrotranspositionally active elements are relatively rare, and that most Alu activity is the result of a small number of ‘master’ copies engaging in retrotranspositional activity over time [7]. Our results suggest that the 10 gorilla-specific Alu Y subfamilies identified in this study diverged and are still diverging from master elements already present in the genome of the common ancestor of the Gorilla and Homo-Pan lineages. A table listing each subfamily, the ‘master gene’ or ancestral Alu subfamily from which it was likely derived, the % divergence from the consensus sequence of the master element, copy number, and suggested age of the most recent common ancestral element is available in the Additional files section of this paper as Additional file 3.

Conclusions

Alu Y subfamily activity appears to be greatly reduced in the western lowland gorilla genome when compared to the human and chimpanzee genomes. The level of activity seen, while not as low as that observed in the genome of the orangutan, is consistent with a change in host surveillance or intrinsic retrotransposition capacity. Alu subfamily age estimates provide further support for the master gene model of Alu retrotransposition, with a relatively low number of ancient lineages responsible for ongoing retrotranspositional activity. The 1,075 lineage-specific Alu Y insertion loci and the 10 subfamilies identified should provide future researchers with a rich source of genetic systems for conservation biology and evolutionary genetics.

Methods

Computational methodology

The genome of the Western lowland gorilla (Gorilla gorilla gorilla) was downloaded and analyzed for the presence of Alu elements using an in-house installation of the RepeatMasker program [62]. The Gorilla gorilla gorilla genome is available for download and analysis via the website of the Wellcome Trust-Sanger Institute [78]. The resulting dataset was parsed into separate files based on the Alu subfamily designations assigned by RepeatMasker. The file containing elements designated as members of the Alu Y subfamily was then further parsed to remove 84,251 elements under the length of 250 bp, on the assumption that shorter elements were likely to be older elements present in multiple species and therefore not useful for our analysis. The ‘Fetch Sequences’ function from the online version of the Galaxy suite of programs [63, 79–81] was then used to retrieve the individual DNA sequence present at each of these loci using the gorilla genome build gorGor3.1, and the Lift Over function was used to examine these loci for gorilla lineage specificity by comparison to the closely related genomes of human (Homo sapiens; hg19), chimpanzee (Pan troglodytes; panTro2), and Sumatran orangutan (Pongo pygmaeus abelii; ponAbe2). An additional 200 bp of flanking sequence on each side of the loci assayed was included in this analysis for validation of orthologous loci between the nine primate species considered in this study (Table 1).
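The length filter and flank extension described above can be illustrated with a short parser for RepeatMasker `.out` annotation lines. This is a minimal sketch and not the in-house scripts used in the study, which are not described: the names `AluHit`, `filter_alu_y`, and `with_flanks` are hypothetical, the sample records are invented for illustration, and matching repeat names by the `AluY` prefix is an assumption about how subfamilies were selected.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class AluHit:
    chrom: str
    start: int  # 1-based, inclusive (RepeatMasker convention)
    end: int    # 1-based, inclusive
    strand: str
    name: str

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def filter_alu_y(lines: Iterable[str], min_length: int = 250) -> Iterator[AluHit]:
    """Yield Alu Y annotations longer than `min_length` bp from .out-style lines."""
    for line in lines:
        fields = line.split()
        # Header and blank lines do not start with a numeric score.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        hit = AluHit(fields[4], int(fields[5]), int(fields[6]), fields[8], fields[9])
        if hit.name.startswith("AluY") and hit.length > min_length:
            yield hit


def with_flanks(hit: AluHit, flank: int = 200) -> Tuple[str, int, int]:
    """Extend a locus by `flank` bp on each side, never going below position 1."""
    return hit.chrom, max(1, hit.start - flank), hit.end + flank


# Invented records in RepeatMasker .out column order, for illustration only.
SAMPLE = """\
   SW  perc perc perc  query  position in query     matching repeat       position in repeat
score  div. del. ins.  sequence begin end (left)    repeat   class/family begin end (left) ID

 2300  5.1  0.0  0.3  chr1  10500 10805 (1000) +  AluY   SINE/Alu  1 306 (5)  1
 1100  8.0  1.0  0.0  chr1  20000 20120 (900)  C  AluYc3 SINE/Alu  (10) 300 180 2
 2100 12.0  0.5  0.2  chr2  500   800   (700)  +  AluSx  SINE/Alu  1 300 (12) 3
"""

if __name__ == "__main__":
    for hit in filter_alu_y(SAMPLE.splitlines()):
        print(hit.name, hit.length, with_flanks(hit))
```

The intervals returned by `with_flanks` remain 1-based and inclusive; tools that expect 0-based BED coordinates would need the start decremented by one.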

Table 1 DNA sample data of all species examined in this study

Loci selected for verification were examined for further evidence of gorilla-specificity using the BLAST-Like Alignment Tool (BLAT) available at the UCSC Genome Browser website [82]. Putative gorilla-specific loci were compared to the available genomes of three other primate species, human (hg19), chimpanzee (panTro2), and orangutan (ponAbe2) [64, 83]. Elements found to be absent in these species and with sufficient orthologous flanking across species were marked for PCR primer design and experimental validation. Loci determined to be shared insertions, as well as those lacking sufficient orthologous flanking for effective primer design, were removed from further consideration [64].

The COSEG program [84], designed to identify repeat subfamilies using significant co-segregating mutations, was then run on the remaining putative gorilla-specific insertions to identify and group specific subfamilies together. COSEG ignores non-diagnostic mutations during analysis, providing an accurate representation of relationships between subfamilies of elements by ignoring potentially misleading mutational events [43]. COSEG uses a minimum subfamily size of 50 elements as the default setting. We arbitrarily defined subfamilies as groups of >10 elements to increase the detail of subfamily structure resolved. A subset of a minimum of 10% of each identified subfamily was then chosen for verification using locus-specific PCR, with a total of 279 loci assayed and verified (Figure 1).

A multi-species alignment comprised of the species listed above was created for each locus using BioEdit [85]. Oligonucleotide primers for the PCR assays were designed in shared regions flanking each putative gorilla-specific element chosen for experimental verification using the Primer3Plus program [86]. These primers were then tested computationally against available primate genomes using the in-silico PCR tool on the UCSC Genome Bioinformatics website [83].

Subfamily age estimates were calculated using the BEAST program [66, 87]. BEAST has previously been used to estimate dates of divergence using transposon data [88]. For each subclade, the consensus sequence for each subfamily was determined from the COSEG output [43]. The progenitor element was determined by RepeatMasker analysis of each consensus sequence. Elements were aligned using the SeaView software program and MUSCLE algorithm [76, 89]. The progenitor element was then used as an out-group to root the tree of each subclade. BEAST was calibrated with a baseline divergence date of 7 million years ago for the split between the Gorilla and Homo-Pan lineages. A divergence date of 7 million years ago is within the generally accepted 6 to 9 million years ago range for this divergence [28, 53–55]. BEAST was run with the following parameters: Site Heterogeneity = ‘gamma’; Clock = ‘strict clock’; Species Tree Prior = ‘birth death process’; Prior for Time of Most Recent Common Ancestor (tmrca) = ‘Normal distribution’ with a mean of 7.0 million years and a standard deviation of 1.0; ucld.mean = ‘uniform model’ with initial rate set at 0.033; Length of Chain = ‘10,000,000’; all other parameters were left at default settings.

The Network program [90] was run on gorilla-specific Alu Y subfamily consensus sequences to generate a stepwise tree of relationships between identified subfamilies [42, 77]. The consensus sequences for the gorilla-specific Alu Y subfamilies were aligned using the MUSCLE algorithm [76] and converted to the .rdf file format using the DNAsp program [91]. The .rdf file was then imported to Network, and a median-joining analysis was run. The resulting output file demonstrating evolutionary relationships between subfamilies is presented in Figure 1C.

PCR and DNA sequencing

To verify gorilla-specificity, locus specific PCR was performed with a nine-species primate panel comprised of DNA samples from the following species: Western lowland gorilla (Gorilla gorilla gorilla); Human HeLa (Homo sapiens); Common chimpanzee (Pan troglodytes); Bonobo (Pan paniscus); Bornean orangutan (Pongo pygmaeus); Sumatran orangutan (Pongo abelii); Northern white-cheeked gibbon (Nomascus leucogenys); Rhesus macaque (Macaca mulatta); African green monkey (Chlorocebus aethiops). Information on all DNA samples used in PCR analysis is listed in Table 1.

PCR amplification of each locus was performed in 25 μL reactions using 15 ng of template DNA, 200 nM of each primer, 200 μM dNTPs in 50 mM KCl, 1.5 mM MgCl2, 10 mM Tris–HCl (pH 8.4), and 2 units of Taq DNA polymerase. PCR reaction conditions were as follows: an initial denaturation at 95°C for 1 min, followed by 32 cycles of denaturation at 95°C, annealing at the previously determined optimal annealing temperature (60°C with some exceptions), and extension at 72°C for 30 s each, followed by a final extension of 72°C for 1 min. PCR products were analyzed to confirm gorilla specificity of all loci on 2% agarose gels stained with 0.25 μg ethidium bromide and visualized with UV fluorescence (Figure 3). A list of all 279 assayed loci, corresponding primer pairs, and optimal annealing temperatures for each is available as Additional file 4 in the Additional files for this study. Additionally, all PCR tested loci containing unidentified bases in the original sequence data were subjected to chain-termination sequencing to verify bp composition [92]. Sequence data generated from this project for gorilla-specific Alu Y subfamilies have been deposited in GenBank under the accession numbers KF668269-KF668278.

Figure 3

Phylogenetic assay of a western lowland gorilla-specific Alu insertion (Primer Pair Gor112). Agarose gel electrophoresis of the gorilla-specific Alu insertion Gor112. The filled site is approximately 550 bp (lane 7) and the empty site is 250 bp (lanes 3 to 6 and 8 to 11). Lanes (1) 100 bp DNA ladder; (2) negative control; (3) human; (4) bonobo; (5) common chimpanzee; (6) northern white-cheeked gibbon; (7) western lowland gorilla; (8) Sumatran orangutan; (9) Bornean orangutan; (10) Rhesus macaque; (11) green monkey; (12) empty; (13) 100 bp DNA ladder.
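The band sizes in this assay follow from simple arithmetic: the filled site is the empty-site amplicon plus the roughly 300 bp Alu insertion. The sketch below scores lanes on that basis; the 50 bp tolerance, the function names, and the lane dictionary are illustrative assumptions rather than part of the published protocol.

```python
def expected_filled_size(empty_site_bp: int, insert_bp: int = 300) -> int:
    """Predicted amplicon size when the Alu insertion is present."""
    return empty_site_bp + insert_bp


def call_band(observed_bp: int, empty_site_bp: int,
              insert_bp: int = 300, tolerance_bp: int = 50) -> str:
    """Classify an observed band as 'filled', 'empty', or 'ambiguous'."""
    filled_bp = expected_filled_size(empty_site_bp, insert_bp)
    if abs(observed_bp - filled_bp) <= tolerance_bp:
        return "filled"
    if abs(observed_bp - empty_site_bp) <= tolerance_bp:
        return "empty"
    return "ambiguous"


if __name__ == "__main__":
    # Approximate band sizes for primer pair Gor112, as given in the Figure 3 legend.
    lanes = {"human": 250, "bonobo": 250, "western lowland gorilla": 550}
    for species, size in lanes.items():
        print(f"{species}: {call_band(size, empty_site_bp=250)}")
```

A locus is scored as gorilla-specific only when the gorilla lane is called filled and every outgroup lane is called empty.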