Background

The Equatorial forests of central Africa are one of the most biologically diverse regions in the world. Understanding the mechanisms that gave rise to such extraordinary tropical diversity remains a subject of intense interest to evolutionary and conservation biologists [1, 2]. Critical to this debate has been the role that Pleistocene climate change has played in tropical vertebrate speciation. The periodic fluctuations in climatic conditions that resulted from Earth’s orbital shifts are well known to have affected the range dynamics of many temperate taxa [3]. However, the extent to which these fluctuations impacted the distribution and diversification of tropical forest taxa remains contested. Proponents of Pleistocene refuge theory (e.g. [4]) have argued that the drier, cooler conditions experienced during successive glacial maxima led to the repeated fragmentation of formerly contiguous forests, which in turn led to allopatric speciation of forest-associated taxa. Although there is currently little evidence to suggest that tropical forest refugia played a major role in the Amazon [5–7], support for their role as drivers of evolutionary diversification in tropical Africa is much greater (e.g. [8–12]). However, the majority of studies to date suggest that these effects are most evident at the population level ([8, 11–16]; but see [17]), with most species divergence times pre-dating the Pleistocene [18–20].

The duikers in the subfamily Cephalophinae (family Bovidae) constitute an ideal group for testing the role of Pleistocene refugia in tropical vertebrate speciation because of their recent origin in the Late Miocene and subsequent rapid radiation [21, 22]. Currently, three duiker genera are recognized: (a) the recently derived, species-rich, forest-dwelling Cephalophus, (b) the dwarf Philantomba, and (c) the monotypic savanna specialist Sylvicapra. Jansen van Vuuren and Robinson [22] have further sub-divided Cephalophus into three major mitochondrial lineages comprising the giant duikers (C. silvicultor, C. spadix, C. dorsalis, and C. jentinki), east African red duikers (C. leucogaster, C. rufilatus, C. nigrifrons, C. natalensis, C. rubidus, and C. harveyi) and west African red duikers (C. callipygus, C. weynsi, C. ogilbyi, and C. niger). To date, support for these lineages has been based solely on single mitochondrial gene genealogies that may not accurately reflect the evolutionary history of this group. Furthermore, the position of the two remaining taxa, C. adersi and C. zebra, remains unresolved and appears to be highly labile between studies [22–24].

The goal of the present study is therefore to estimate a well-supported species tree for duikers using a combination of mitochondrial genes and nuclear introns previously shown to be highly informative in resolving relationships between other closely-related African bovids [25, 26]. We then use this tree to re-evaluate the evolutionary relationships of major lineages within this group and test the hypothesis that speciation of many duikers occurred during the Pleistocene epoch using a fossil calibrated relaxed molecular clock [27].

Results

The final aligned data matrix contained four unlinked nuclear DNA regions and two mitochondrial DNA regions for a total of 4152 characters, of which 1172 were from mitochondrial and 2980 were from nuclear loci (Table 1). Results from the pair-wise ILD test between the two mitochondrial partitions fail to reject the null hypothesis that both loci are congruent. As expected, the mitochondrial partition contained a greater proportion of variable sites (38%) relative to the nuclear matrix (15%). The mitochondrial partition contained 368 parsimony informative characters (31%) while the nuclear partition contained 208 parsimony informative characters (8%). The consistency index (CI) and retention index (RI) values in the mitochondrial partition (CI: 0.417, RI: 0.621) are lower than those of the nuclear matrix (CI: 0.823, RI: 0.846), indicating higher levels of homoplasy in the mitochondrial dataset.

Table 1 Sequence variability

The results of the pair-wise ILD tests between nuclear loci reject the null hypotheses that the four nuclear genes are congruent with one another (p < 0.10). Individual nuclear gene genealogies estimated using Maximum Parsimony (MP), Maximum Likelihood (ML) and Bayesian Analysis (BA) methods recovered different topologies (Figures 1, 2, 3 and 4; Additional file 1 for node support values) and these differences were evident regardless of method. Given that there was generally little support for most of the nodes within the individual nuclear gene trees, many of these differences between gene genealogies likely represent soft polytomies. However, there is also significant support for some of these differences. For example, the MGF genealogy supports two clades of giant duikers (MP bootstrap = 63% and 94%, ML bootstrap = 75% and 92%, BA posterior probability = 0.99 and 1.00) while the THY genealogy supports only one clade (MP = 84, ML = 85, BA = 0.99). The THY genealogy also supports the placement of S. grimmia as sister to the remaining Cephalophinae (MP = 90, ML = 91, BA = 1.00) with Philantomba sister to the giant duikers (MP = 89, ML = 93, BA = 1.00), unlike the PRKCl genealogy, which supports Philantomba as sister to the remaining Cephalophinae (MP = 83, ML = 81, BA = 1.00) or the SPTBN genealogy, which supports Philantomba as sister to only the red duiker clades (MP = 73, ML = 76, BA = 1.00).

Figure 1

MGF gene genealogy. Majority-rule consensus tree showing the Bayesian estimate of nuclear gene trees for MGF. Thickened branches indicate nodal support by both BA posterior probability (PP) values ≥ 0.95 and ML bootstrap support (BS) ≥ 75. Additional file 1: Table S1 lists support values by node for this phylogeny.

Figure 2

PRKCl gene genealogy. Majority-rule consensus tree showing the Bayesian estimate of nuclear gene trees for PRKCl. Thickened branches indicate nodal support by both BA posterior probability (PP) values ≥ 0.95 and ML bootstrap support (BS) ≥ 75. Additional file 1: Table S1 lists support values by node for this phylogeny.

Figure 3

SPTBN gene genealogy. Majority-rule consensus tree showing the Bayesian estimate of nuclear gene trees for SPTBN. Thickened branches indicate nodal support by both BA posterior probability (PP) values ≥ 0.95 and ML bootstrap support (BS) ≥ 75. Additional file 1: Table S1 lists support values by node for this phylogeny.

Figure 4

THY gene genealogy. Majority-rule consensus tree showing the Bayesian estimate of nuclear gene trees for THY. Thickened branches indicate nodal support by both BA posterior probability (PP) values ≥ 0.95 and ML bootstrap support (BS) ≥ 75. Additional file 1: Table S1 lists support values by node for this phylogeny.

All four genealogies support the monophyly of Philantomba (MGF: MP = 95%, ML = 95%, BA = 1.0; PRKCl: MP = 96%, ML = 100%, BA = 1.0; SPTBN1: MP = 97%, ML = 97%, BA = 1.0; THY: MP = 83%, ML = 87%, BA = 1.0). Sylvicapra was sister to some or all of the giant duikers in three of the four nuclear trees (MGF: MP = 66%, BA = 0.52; PRKCl: ML = 54%, BA = 0.61; SPTBN1: MP = 60%, ML = 56%, BA = 0.97) but this relationship lacked significant support. Within Cephalophus, support was generally weak or lacking for the giant, east and west African red duiker lineages described by Jansen van Vuuren and Robinson [22], although the SPTBN1 genealogy supported the monophyly of the west African red duiker lineage (MP = 85%, ML = 88%, BA = 1.0) and the THY genealogy recovered the giant duiker lineage (MP = 84%, ML = 85%, BA = 0.99). The position of C. adersi and C. zebra varied across genealogies and remained unresolved or weakly supported, with the exception of the MGF genealogy, which supported C. zebra as sister to the C. jentinki/C. dorsalis clade (MP = 94%, ML = 92%, BA = 1.0), and the THY genealogy, which supported C. adersi as sister to the east and west African red duikers (MP = 82%, ML = 80%, BA = 1.0).

As results from ILD and SH tests suggested that the mitochondrial and nuclear topologies are incongruent (p ≤ 0.006), both datasets were first analysed separately and then combined. The mitochondrial tree of all species within Cephalophinae (Figure 5) shows weak support for the monophyly of Philantomba (MP = 88%, ML = 53%, BA = 0.76), but has strong support for the sister placement of these taxa relative to all other Cephalophinae (MP = 68%, ML = 91%, BA = 1.0). Sylvicapra is sister to the giant duikers, although this node has weak support (MP = 32%, ML = 58%, BA = 0.84). Within Cephalophus, there is strong support for the monophyly of the giant duikers (MP = 98%, ML = 97%, BA = 1.0), the east African red duikers (MP = 93%, ML = 98%, BA = 1.0), and the west African red duikers (MP = 87%, ML = 83%, BA = 0.99), but weak support for their placement relative to one another. The position of C. zebra and C. adersi is unresolved. There is also weak support for the paraphyly of C. rufilatus relative to C. nigrifrons (MP = 59%, ML = 44%, BA = 0.87) and strong support for the paraphyly of C. callipygus relative to C. ogilbyi and C. weynsi (MP = 98%, ML = 97%, BA = 1.0).

Figure 5

Mitochondrial gene genealogy. Majority-rule consensus tree showing the Bayesian estimate of the complete mitochondrial dataset. Thickened branches indicate nodal support by both BA posterior probability (PP) values ≥ 0.95 and ML bootstrap support (BS) ≥ 75. Additional file 1: Table S1 lists support values by node for this phylogeny.

In nuclear concatenated matrices, the harmonic mean of the log likelihood of the partitioned combined mitochondrial and nuclear Bayesian analysis (hm2) was equal to −15370.43 compared to the log likelihood for the unpartitioned analysis (hm1) of −15851.64, giving a value of 2 ln BF = −962.42 and providing strong evidence against a partitioned model. Alternatively, when the mitochondrial data are excluded from analyses, Bayes Factor analysis found strong evidence for the partitioned model (hm2 = −8088.25, hm1 = −7976.32, 2 ln BF = 223.86).
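The Bayes Factor arithmetic above can be reproduced with a short sketch. It assumes the definition given in the Methods, 2 ln BF(21) = 2[ln(hm2) − ln(hm1)], and takes the harmonic-mean log likelihoods exactly as reported in the text; the function name is hypothetical. Because the sign of the statistic depends on which model is treated as model 2, the sketch compares magnitudes only.

```python
def two_ln_bayes_factor(ln_hm2: float, ln_hm1: float) -> float:
    """Return 2 ln BF(21) = 2 * (ln hm2 - ln hm1).

    ln_hm2 and ln_hm1 are the log harmonic-mean likelihoods of the
    two competing analyses (here: partitioned and un-partitioned).
    """
    return 2.0 * (ln_hm2 - ln_hm1)


# Values as reported in the text, in the order (hm2, hm1).
combined = two_ln_bayes_factor(-15370.43, -15851.64)
nuclear_only = two_ln_bayes_factor(-8088.25, -7976.32)

print(f"combined mitochondrial + nuclear: |2 ln BF| = {abs(combined):.2f}")
print(f"nuclear only:                     |2 ln BF| = {abs(nuclear_only):.2f}")
```

The two magnitudes printed correspond to the 962.42 and 223.86 quoted above.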

The concatenated nuclear tree (Figure 6) shows strong support for the monophyly of Philantomba (MP bootstrap = 100%, ML un-partitioned/partitioned bootstrap = 100/100, BA un-partitioned/partitioned posterior probability = 1.0/1.0). However, the sister position of this genus relative to the other duikers is not supported. Nuclear analyses also support a sister relationship between Sylvicapra and the C. silvicultor/C. spadix group (MP = 75%, ML = 66/75%, BA = 1.0/0.98), making both the genus Cephalophus and the giant duiker lineage paraphyletic. There is also support for the monophyly of the east African red duiker lineage (MP = 91%, ML = 91/98%, BA = 0.99/1.0), the west African red duiker lineage (MP = 67%, ML = 75/86%, BA = 1.0/1.0), and a sister relationship between these two red African duiker lineages (MP = 95%, ML = 100%/100%, BA = 1.0/0.82). Cephalophus adersi is sister to both the east and west African red duikers (MP = 60%, ML = 93/91%, BA = 0.93/0.74) and C. zebra is sister to the C. jentinki/C. dorsalis group (MP = 77%, ML = 59/70%, BA = 1.0/0.98). Unlike the mitochondrial tree, C. rufilatus and C. nigrifrons form reciprocally monophyletic clades (MP = 86%, ML = 82/95%, BA = 1.0/1.0 and MP = 99%, ML = 99/100%, BA = 1.00, respectively) in the nuclear tree. However, C. harveyi is paraphyletic with respect to C. natalensis, as is P. monticola to P. maxwelli. Finally, C. callipygus and C. ogilbyi form an unresolved polytomy.

Figure 6

Combined nuclear tree. Majority-rule consensus cladogram showing the Bayesian estimate of the species tree from nuclear concatenated (left) and mitochondrial (right) datasets. Thickened branches indicate nodal support by both BA posterior probability (PP) values ≥ 0.95 and ML bootstrap support (BS) ≥ 75. Additional file 1: Table S1 lists support values by node for this phylogeny. Boxes show major lineages on a grey scale, starting with the giant duikers in white, then the savannah duiker, the east African red duikers, the west African red duikers, and the dwarf duikers in darkest grey.

The concatenated mitochondrial and nuclear combined analysis yielded an almost completely resolved tree topology (Figure 7). Philantomba is both monophyletic and sister to the remainder of the Cephalophinae. Cephalophus was paraphyletic, with Sylvicapra as sister to the monophyletic giant duiker clade. The east and west African red duiker lineages are monophyletic and are sister to one another. While their placement is not strongly supported by all methods of estimation, Bayesian support places C. adersi as sister to the east and west African red duiker lineages (MP = 61%, ML = 73/65%, BA = 1.0/0.98). Placement of C. zebra as sister to the giant and savanna duiker lineages is not supported.

Figure 7

Species tree. Chronogram illustrating species relationships and divergence times based on Bayesian analysis of the total evidence (i.e. concatenated mitochondrial and nuclear DNA). Node numbers refer to divergence time estimations given in Table 2. Thickened branches indicate nodal support by both BA posterior probability (PP) values ≥ 0.95 and ML bootstrap support (BS) ≥ 75. Additional file 1: Table S1 lists support values by node for this phylogeny.

Analyses in BEAST recovered the same topologies as those obtained by BA methods. However, the tree estimated using both nuclear and mitochondrial data was better resolved with higher support and narrower confidence intervals than the tree estimated from nuclear analysis alone. For this reason, we discuss only the results of estimation from both nuclear and mitochondrial data, although ages for nodes recovered in both analyses are presented in Table 2. The split of Philantomba from all other members of the Cephalophinae was estimated to have occurred during the late Miocene at 8.73 Ma (6.27-11.43 highest posterior density, HPD). This is followed by the divergence of the giant duiker and Sylvicapra lineage from the red duikers at 7.03 Ma (5.02-9.19 HPD), with C. zebra and C. adersi occupying a sister position relative to these two major groups. This major split is then followed by a subsequent split between the east and west African red duiker lineages at 4.98 Ma (3.58-6.69 HPD) during the Pliocene. With the exception of the dwarf duikers P. monticola and P. maxwelli, all sister duiker species are estimated to have originated during the Pleistocene (< 2.558 Ma). These sister species pairs comprise C. jentinki and C. dorsalis, C. nigrifrons and C. rufilatus, C. natalensis and C. harveyi, C. spadix and C. silvicultor, and C. callipygus and C. ogilbyi.
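For readers unfamiliar with the HPD intervals quoted above, the following is a minimal sketch of how such an interval can be computed from a set of posterior samples of a node age. It is not the BEAST implementation; the function name is hypothetical, and the 0.95 default mass is an assumption, since the text does not state the level used.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples.

    Sorts the samples, then slides a window of k consecutive values
    (k = ceil(mass * n)) and keeps the window with the smallest width.
    """
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    best_lo, best_hi = xs[0], xs[k - 1]
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi
```

Unlike an equal-tailed interval, the HPD shifts toward the denser side of a skewed posterior, which is common for divergence-time estimates.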

Table 2 Divergence times

Discussion

Past attempts to reconstruct the evolutionary history of the Cephalophinae have met with considerable challenge [22–24], likely owing to the recent and rapid radiation of this group [21, 22]. Using a combination of mitochondrial and nuclear markers, the present study provides the most well supported phylogeny to date. This study also provides convincing support for the position of Philantomba as sister to the remaining Cephalophinae and the recognition of the genus, as recommended by Jansen van Vuuren and Robinson [22]. In contrast, there is no support for recognizing Sylvicapra as a distinct genus, since its sister relationship to the giant duikers leaves Cephalophus paraphyletic. Instead, our findings suggest that S. grimmia represents the sole savanna-dwelling member of the giant duiker lineage within Cephalophus and likely evolved from a forest-dwelling common ancestor, further reinforcing Grubb's [28] belief that habitat transitions occur primarily from forest to savannah. While Jansen van Vuuren and Robinson [22] were correct in hypothesizing that the savannah duiker diverged early in the group’s evolutionary history, this study shows that its return to the savannah does not predate the appearance of other forest-dwelling taxa. The present phylogeny also provides much stronger support for the three main lineages of Cephalophus identified by Jansen van Vuuren and Robinson [22] and for the first time provides significant support for their placement relative to one another. The failure of this and earlier studies to place C. adersi and C. zebra in relation to these major lineages is more likely to be a reflection of the rapidity with which these older taxa may have radiated rather than a failure to resolve species nodes.

A comparison of the mitochondrial and nuclear DNA phylogenies also sheds light on the evolutionary processes operating within this group. Because mitochondrial DNA has a quarter of the effective population size of nuclear DNA, mitochondrial haplotypes generally sort much more rapidly [29]. Thus, in recently diverged lineages it is expected that the paraphyly observed in mitochondrial DNA should also be reflected in the nuclear data [30], as is observed for C. natalensis/C. harveyi and C. ogilbyi/C. callipygus. Incomplete lineage sorting would also explain the paraphyly observed in the nuclear DNA of species that exhibit reciprocally monophyletic relationships in mitochondrial analyses, as appears to be the case for C. silvicultor/C. spadix and P. monticola/P. maxwelli. However, C. nigrifrons and C. rufilatus do not follow either of these patterns, exhibiting a paraphyletic relationship in mitochondrial analyses and a reciprocally monophyletic relationship in nuclear analyses. One explanation for these findings is that mitochondrial introgression between C. nigrifrons and C. rufilatus, followed by extensive back-crossing to the original parental taxa, may have obscured mitochondrial relationships but maintained their monophyly at the nuclear level. These two taxa occupy parallel distributions across central Africa, providing ample opportunities for hybridization. Interestingly, Bayesian analysis of the nuclear data provides support for a sister relationship between C. nigrifrons and the C. natalensis/C. harveyi clade, indicating that C. nigrifrons and C. rufilatus may not be sister taxa, as previous mitochondrial analyses suggest [22–24].

We also report the surprising finding that while the origin of most major lineages within the subfamily date to the late Miocene/early Pliocene, many duiker species arose during the Pleistocene. From the mid-Miocene climatic optimum onwards, the earth has experienced a gradual cooling trend that continued through the Plio-Pleistocene [31]. The onset of much drier, colder periods at the boundary between these two epochs and subsequent intensification of glacial cycles throughout the Pleistocene is thought to have provided important opportunities for the diversification and increased turnover of African vertebrate species, including many arid-adapted bovids [21, 32]. Grassland expansion during glacial maxima would have confined forest-adapted species to fragments of suitable habitat and broken up the formerly contiguous Equatorial African rainforest belt into several major refugia to the west, center and east of Africa [33–35]. Such geographic isolation is thus postulated to have provided ideal opportunities for the allopatric fragmentation and speciation of tropical forest species ([36]; reviewed in [9]).

Despite the intrinsic appeal of this hypothesis, examples of Pleistocene-era tropical forest speciation are few. Opponents of tropical Pleistocene refuge theory have argued that many species divergence times pre-date the Pleistocene [37] or that forest refugia simply acted as reservoirs of genetic variation but did not drive speciation per se [20, 38]. Divergence times reported for many central African taxa support this claim ([18–20, 26, 39]; but see [17]), although there is ample evidence for intra-specific diversification within many groups including tropical forest mammals [3].

Although our results are surprising, the divergence times we observed here are also consistent with earlier estimates for this group [22]. Moreover, many sister species occupy neighbouring yet allopatric ranges. This is illustrated by the east versus west and western central African (i.e. Congo Basin) distribution of C. spadix and its sister taxon C. silvicultor, a pattern mirrored by the intra-specific structure of the roan antelope (Hippotragus equinus) [40]. Similarly, we have also observed splits between taxa that occupy a west versus western central African distribution, notably: 1) C. jentinki (west African) from C. dorsalis (west and western central African); 2) C. ogilbyi (west and western central African) from C. callipygus (western central African); and 3) P. maxwelli (west African) from P. monticola (western central, eastern and South African). This pattern is similar to the diversification of the murid rodent (Praomys misonnei), a forest-associated taxon whose intra-specific distribution is also believed to have been driven by allopatric speciation during refugial isolation [41]. Similarly, there is ample genetic evidence to suggest that the split between eastern and western gorillas (Gorilla gorilla) arose during the Pleistocene as a result of climate-induced changes in forest cover during this period [13, 42]. Lastly, we also observed a pattern of north versus south speciation in the forest-dwelling C. nigrifrons and its savanna-dwelling sister species C. rufilatus, as well as an east versus south African species split between C. harveyi and C. natalensis. Remarkably, of these sister species pairs, it is only in the case of C. ogilbyi and C. callipygus that their ranges overlap. Taken together, these data strongly suggest a pattern of Pleistocene-era fragmentation that led to the distribution of the sister species pairs that we see today.

The divergence times of the duikers, however, contrast with findings from another forest bovid subfamily (the Tragelaphini) whose estimated speciation times range across the Miocene and Pliocene from 13 to 3 Ma [26]. More recent studies of Tragelaphus scriptus also point to a large sub-specific diversification within this taxon whose timing might also have been driven by Pleistocene climate change [15]. This then raises the question of the nature of species boundaries within the Cephalophinae and other African bovids and whether the timing of the radiation observed here more accurately reflects sub-specific diversification and/or incipient speciation, as is evidenced by several instances of paraphyly and/or hybridization between sister taxa.

Unlike earlier studies of the subfamily, the present study is the first to use both nuclear and mitochondrial data to estimate a species tree, and a fossil calibration point to date divergence times of duikers to the Pleistocene. The pattern of vertebrate radiation observed here fits that advocated by Avise et al. [43], who postulated that Pleistocene glaciations either initiated intra-specific differentiation or furthered the speciation of lineages whose origin predated the Pleistocene. Further work should therefore investigate the extent of gene flow between recently derived species and use a coalescent approach to assess divergence times within this group [44].

Conservation implications

An accurate estimation of a species tree is often a useful precursor for guiding conservation and management decisions [45]. Phylogenetic analysis in this study finds significant support for the recognition of a distinct west African red duiker taxon C. rubidus which is geographically-restricted and was previously treated as a subspecies of C. nigrifrons within the east African red duiker clade. This apparent conflict lends strength to Jansen van Vuuren & Robinson's [22] recommendation that this taxon should be managed as a distinct species, elevating its conservation status from threatened to endangered [46]. The relationship between C. callipygus and the CITES protected species C. ogilbyi also appears problematic. Inclusion of nuclear data further substantiates the lack of any clear genetic distinction between these two taxa and is consistent with a history of either recent or on-going hybridization and/or incomplete lineage sorting. Given the results of the present study, it seems unlikely that any mitochondrial or nuclear marker will be able to differentiate these two taxa, posing a challenge to the regulation of the bushmeat trade [24, 47] or wildlife monitoring studies of field collected feces [23]. Further work should explore patterns of range-wide population genetic variation between these two taxa in order to better understand their species status and potential for hybridization.

Conclusions

Fluctuations in climate and increasing aridity over the past few million years are thought to have played an important role in shaping diversification of many African taxa [21, 32, 48]. Although many previous studies have shown that the majority of speciation events date to the Pliocene (e.g. [18, 20, 49, 50]), Pleistocene-era climatic oscillations are also thought to have played an important role in shaping patterns of diversification, particularly at the population level (e.g. [11, 13]). Here we report on a remarkably recent radiation of a group of duiker whose sister species pairs appear to date predominantly to the Pleistocene. As is the case for other forest artiodactyls, taxa within this group are tied to forest environments, thus highlighting the potential importance that Pleistocene refugia may have played in the speciation of forest-dwelling species. Data from this study also highlight several areas of inconsistency between our current understanding of duiker taxonomy and the evolutionary relationships depicted here. Consistent with their recent origin, several sister species groups exhibit paraphyletic relationships and/or evidence of recent hybridization. Further work should therefore aim to sample more widely across these sister taxa in order to better understand the geographic range of paraphyletic lineages and identify potential areas of introgression. These findings may also prove particularly relevant to future conservation efforts, given that many species are presently regulated under the Convention on International Trade in Endangered Species (CITES) and are therefore targets for the bushmeat trade.

Methods

Tissue was sampled from 24 individuals within the Cephalophinae, representing all eighteen species recognized by the International Union for Conservation of Nature (IUCN) [51]. Sequences were also obtained from GenBank for the newly discovered species, P. walteri [52], and one taxon that is considered a subspecies by IUCN (C. rubidus) (Additional file 2). With respect to a suitable outgroup, recent mitochondrial studies have suggested that the klipspringer (Oreotragus oreotragus) may be sister to the Cephalophinae [53, 54]. However, nuclear markers [25] and supertree analysis [55] do not provide support for this relationship, or for any consistent sister group to the Cephalophinae. Given the uncertainty of these relationships, we have included not only O. oreotragus as a candidate outgroup but also two other closely related taxa within the subfamily Antilopinae (the suni Neotragus moschatus and Kirk's dik-dik Madoqua kirkii) alongside two more divergent species within the subfamily Bovinae, the bushbuck (Tragelaphus scriptus) and the sitatunga (T. spekei).

Samples were obtained from bushmeat market surveys conducted in collaboration with the Wildlife Conservation Society (WCS) in Gabon, or donated by zoos and other researchers. With the exception of the easily distinguishable P. monticola and T. spekei, a photographic record was used to verify the species identity of all WCS collected bushmeat samples. Tissue samples of several species obtained from the San Diego Zoo and a fecal sample taken from C. jentinki at Gladys Porter Zoo were accompanied by species records. Details for all remaining samples are found in [22–25, 56].

DNA from all bushmeat and some San Diego Zoo tissues was extracted using a standard phenol-chloroform extraction method [57]. DNA provided by Jansen van Vuuren was extracted according to the methods described in [22]. Other samples provided by the San Diego Zoo were obtained as genomic DNA extracts. The C. jentinki fecal sample was extracted using the QIAamp DNA Stool Minikit (Qiagen) in a designated room and a blank was included to control for DNA contamination. The C. harveyi sample AB05 was extracted from blood using a salt-based extraction method [58].

Portions of two coding mtDNA genes were included in phylogenetic analyses: 514 bp of the cytochrome b (cytb) gene and 658 bp of the cytochrome c oxidase subunit 1 (COX1). See Additional file 2 for GenBank accession numbers for sequences obtained from previous studies or from the current study. Most (n = 36) of the cytb sequences were previously published [22, 23, 52, 59]. All GenBank sequences were trimmed to match the cytb region employed by Ntie et al. [23]. The cytb gene fragment from C. jentinki was amplified according to published primers and protocols [23]. Similarly, most (n = 31) of the COX1 sequences were previously published [24, 52] and an additional five samples were amplified according to published protocols [24]. To test for the potential presence of non-functional nuclear translocated copies of mitochondrial DNA (Numts), each of the mitochondrial gene sequences was translated to amino acids in the program MEGA v3 [60]. No evidence of frameshifts or stop codons was found.
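The Numt screen described above rests on a simple idea: a functional mitochondrial coding fragment, read in the correct frame, should contain no internal stop codons. The following is a minimal sketch of that check, not the MEGA procedure itself. The function name and the toy sequences are hypothetical, the input is assumed to be ungapped, and the reading frame of each fragment is assumed to be known.

```python
# Stop codons under the vertebrate mitochondrial genetic code.
# Note that TGA encodes tryptophan in this code, so it is not a stop.
VERT_MT_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def internal_stop_positions(seq: str, frame: int = 0) -> list[int]:
    """Return 0-based nucleotide offsets of stop codons in the given frame.

    A trailing incomplete codon is ignored, since a gene fragment need
    not be a multiple of three bases long.
    """
    seq = seq.upper()
    hits = []
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in VERT_MT_STOPS:
            hits.append(i)
    return hits


# Toy sequences for illustration only (not real duiker data).
clean = "ATGGCATGAACC"      # ATG GCA TGA ACC: no mitochondrial stop codon
suspect = "ATGAGACCC"       # ATG AGA CCC: AGA is a stop at offset 3

print(internal_stop_positions(clean))
print(internal_stop_positions(suspect))
```

An empty list is consistent with a functional copy; any hit in the expected frame would flag the sequence for closer inspection.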

Four nuclear DNA markers were also amplified and sequenced using published primers and PCR conditions [61]. These markers span introns within four genes: stem cell factor (MGF), protein-kinase-CI (PRKCl), B-spectrin non-erythrocytic (SPTBN1) and thyrotropin (THY). Internal primers were designed and used to amplify smaller fragments for samples that were highly degraded or difficult to amplify (Additional file 3). PRKCl, SPTBN1 and THY sequences for outgroup taxa M. kirkii, N. moschatus, and O. oreotragus were obtained from Genbank [25]. Following amplification, all PCR products were purified using ExoAp [62] and then sequenced on both strands using the BigDye Terminator Cycle Sequencing Kit v1.1 (ABI). Resulting products were run on a 3100 ABI automated DNA sequencer. Forward and reverse sequences were edited using the program SEQUENCHER v4.1.1 (Gene Codes Corporation, Ann Arbor, MI, USA). For nuclear loci, heterozygous individuals were verified by the presence of two similarly sized, overlapping peaks observed in both sequencing directions, and were coded using standard IUPAC ambiguity codes.
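The IUPAC coding of heterozygous sites mentioned above can be illustrated with a small lookup. This is a sketch of the standard two-base ambiguity codes only; the function name is hypothetical and the actual base calling was done by inspection of chromatograms in SEQUENCHER, not by code like this.

```python
# Standard IUPAC codes for sites where two different bases are present.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def genotype_to_iupac(allele1: str, allele2: str) -> str:
    """Collapse the two bases observed at a site into one IUPAC symbol."""
    a, b = allele1.upper(), allele2.upper()
    if a not in {"A", "C", "G", "T"} or b not in {"A", "C", "G", "T"}:
        raise ValueError("alleles must be one of A, C, G, T")
    if a == b:
        return a  # homozygous site keeps its base
    return IUPAC_TWO_BASE[frozenset((a, b))]
```

For example, a site with overlapping A and G peaks in both sequencing directions would be coded as R.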

The incongruence length difference test (ILD; [63]) implemented in PAUP* vers. 4.0b10 [64] was used to evaluate incongruence between mitochondrial genes, nuclear introns and between combined mitochondrial and nuclear datasets. These ILD tests used 1,000 randomized partitions of the data and a heuristic search on each randomization to obtain the sum of tree lengths for each partition. The models of nucleotide substitution that best fit the data were selected by jModelTest [65, 66] under the Bayesian information criterion (BIC; [67]).
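As an illustration of model selection under the BIC, the sketch below scores candidate substitution models from their log likelihoods and picks the lowest score. It assumes the usual form BIC = −2 ln L + k ln n with alignment length as the sample size n; the model names, log likelihoods and parameter counts are invented placeholders, not values from this study.

```python
import math


def bic(ln_likelihood: float, n_free_params: int, n_sites: int) -> float:
    """Bayesian information criterion: -2 ln L + k ln n (lower is better)."""
    return -2.0 * ln_likelihood + n_free_params * math.log(n_sites)


# Hypothetical candidates: name -> (ln L, number of free parameters).
candidates = {
    "model_A": (-3050.0, 10),
    "model_B": (-3040.0, 14),
}
n_sites = 1172  # length of the mitochondrial matrix, used as sample size

scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best, round(scores[best], 2))
```

In this toy case the richer model fits slightly better but is penalised for its extra parameters, so the simpler model is preferred.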

We also tested for topological concordance between phylogenetic trees derived from mitochondrial and nuclear DNA data partitions using the likelihood-based SH test [68] implemented in PAUP*. The ML tree for mitochondrial dataset was first estimated using jModeltest parameters for that partition. A second ML search on the same dataset was then carried out using the nuclear topology as a constraint. The significance of the difference in the sum of the site-wise log likelihoods of the two trees (unconstrained versus constrained) was then assessed using the SH test. A reciprocal test of topology was also conducted by first estimating the ML tree from the concatenated nuclear dataset and then estimating a constrained ML tree forced to fit the mitochondrial topology.

Gene trees were estimated for each of the four nuclear introns and the combined mitochondrial genes using maximum parsimony (MP), maximum likelihood (ML), and Bayesian (BA) methods (see below). Additionally, nuclear introns were concatenated with and without mitochondrial sequences into a single data matrix for species tree estimation using MP, ML, and BA methods. Nuclear sequences were not available for C. weynsi, C. rubidus and P. walteri either because the sample failed to amplify or because no tissue was available for the present study.

All MP analyses were performed in PAUP*. For each analysis, preliminary maximum parsimony searches were conducted using heuristic search methods with tree bisection reconnection (TBR) branch swapping, collapse of zero-length branches, all characters weighted equally, and 100 replicates of the random addition starting tree option. A nonparametric bootstrap test [69] was carried out using 300 replicates. The “Max Trees” limit was set to 50,000 for both the initial searches and the bootstrap tests.
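The nonparametric bootstrap resamples alignment columns with replacement, repeats the tree search on each pseudo-replicate, and reports how often a clade is recovered. The sketch below shows only the resampling and the support calculation; the tree search itself is left to PAUP*, and the function names and the representation of a tree as a set of clades are hypothetical.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Resample whole columns with replacement, keeping taxa in register."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in cols) for taxon, seq in alignment.items()}


def bootstrap_replicates(alignment: dict, n_replicates: int = 300, seed: int = 1) -> list:
    """Generate pseudo-replicate alignments (300 by default, as in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


def clade_support(replicate_clades: list, clade) -> float:
    """Fraction of replicate trees (each a set of frozenset clades) containing clade."""
    target = frozenset(clade)
    return sum(target in clades for clades in replicate_clades) / len(replicate_clades)
```

Because entire columns are resampled, each pseudo-replicate has the same number of sites and taxa as the original alignment.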

Maximum likelihood analyses using a single model of nucleotide substitution for individual introns and concatenated mitochondrial and nuclear matrices were performed in PAUP* vers. 4.0b10 for UNIX. Heuristic searches were carried out using the TBR branch swapping algorithm, collapsing zero-length branches and using 100 replicates of the random addition option for the starting tree. Nonparametric bootstrap values were calculated from a consensus of the 300 replicate searches.

Two additional ML searches were conducted in RAxML vers. 7.0.4 [70] in which each nuclear intron was assigned its own model of nucleotide substitution with or without the inclusion of the mitochondrial data as an additional partition. Within each heuristic search, 500 discrete starting trees were used and a bootstrap consensus tree was estimated from the resulting trees. Each search used a GTR model of nucleotide substitution with the gamma model of rate heterogeneity initiated from a complete random starting tree. Model parameters were optimized to a likelihood difference of 0.00001. Each bootstrap analysis was repeated twenty times to explore tree space and ensure that each analysis converged on a similar likelihood score.

Bayesian analyses were carried out using the Metropolis-coupled Markov chain Monte Carlo (MCMC) methods implemented in MrBayes vers. 3.1.2 [71]. Each analysis included two independent, simultaneous runs. Each run consisted of four chains: one ‘cold’ chain and three chains heated according to the default heating parameters of MrBayes. Each chain was initiated from a random starting tree and run for up to 50 million generations. Each run was sampled every 1,000 generations, for a total of up to 50,001 tree samples per run. As simultaneous runs converge onto the stationary distribution, the average standard deviation of split frequencies should approach zero. Convergence was therefore assumed when the average standard deviation of split frequencies between simultaneous runs, as calculated by MrBayes, was less than 0.01. Additionally, trace files were evaluated with the program Tracer vers. 1.5 [72] and 10% of points collected prior to chain stationarity were discarded as burn-in. The parameter and tree samples from the two simultaneous runs were combined and summarized using the sump and sumt commands, respectively. For the first set of runs, BA searches assumed a single model of nucleotide substitution across the dataset. A second analysis was carried out in which nuclear genes were partitioned to allow each gene to have its own model of nucleotide substitution. This analysis was repeated with the mitochondrial DNA included as an additional partition. Bayes Factor (BF) analysis was used to investigate the effects of partitioning on the Bayesian analysis. Following [73], two times the natural logarithm of the Bayes Factor was calculated as 2 ln BF(21) = 2[ln(hm2) - ln(hm1)], where hm2 and hm1 are the harmonic means of the post-burn-in likelihood values for the partitioned and un-partitioned analyses, respectively, as estimated using the sump command in MrBayes.
The threshold of 2 ln BF > 10 was taken as strong evidence for the partitioned model [74]. Although the harmonic mean is not the best estimator of the marginal likelihoods used to compute the Bayes Factor, alternative methods [75–77] are either computationally intensive or not readily implementable at this time.
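The Bayes Factor comparison reduces to simple arithmetic on the log scale. The following sketch computes the log of the harmonic mean of the likelihoods from a list of post-burn-in log-likelihood samples (using a log-sum-exp step for numerical stability) and then 2 ln BF. The function names and all numerical values are hypothetical and are not taken from the analyses reported here.

```python
import math


def ln_harmonic_mean(ln_likelihoods: list) -> float:
    """ln of the harmonic mean of likelihoods, given ln-likelihood samples.

    HM = n / sum(1 / L_i), so ln HM = ln n - logsumexp(-ln L_i).
    """
    n = len(ln_likelihoods)
    negated = [-x for x in ln_likelihoods]
    shift = max(negated)
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in negated))
    return math.log(n) - log_sum


def two_ln_bf(ln_hm_partitioned: float, ln_hm_unpartitioned: float) -> float:
    """2 ln BF(21) = 2 [ln(hm2) - ln(hm1)], with model 2 the partitioned model."""
    return 2.0 * (ln_hm_partitioned - ln_hm_unpartitioned)


def strong_support(two_ln_bf_value: float, threshold: float = 10.0) -> bool:
    """Apply the 2 ln BF > 10 threshold used in the text."""
    return two_ln_bf_value > threshold


# Hypothetical ln harmonic means for a partitioned and an un-partitioned run.
example = two_ln_bf(-9950.0, -10000.0)
```

With these made-up values, 2 ln BF is 100, which exceeds the threshold of 10 and would favour the partitioned model.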

Divergence times and tree topology were simultaneously estimated using the program BEAST vers. 1.6.1 [78]. BEAST analyses were run with and without the mitochondrial data because ILD tests indicated conflicting signal between the nuclear and mitochondrial genomes. The likelihood ratio test implemented in PAUP* was used to determine if a molecular clock hypothesis could be rejected for each locus. Radiometrically dated fossil remains suggest that the earliest appearance of the Cephalophinae was between 6.31 – 5.65 Ma [21], which coincides well with the estimated oldest speciation event within Cephalophinae at 5.3 Ma (± 53,434 years) [22], using a cytb molecular clock calibration for the family Bovidae [59]. From this information, the prior on the age of the node uniting all taxa within the Cephalophinae was set as a lognormal distribution with an upper bound set as an offset value of 5.3 Ma, a log mean of 0.32 Ma from this offset value and log standard deviation of 1 Ma such that 95% of the prior probability encompassed the timeframe suggested by fossil evidence. The prior on the stem of the tree was set as a normal distribution with a mean of 20.1 Ma and a SD of 2.25 Ma. This prior distribution encompasses the dates (16.4 – 23.8 Ma) within which the split between the Bovinae and Antilopinae is believed to have occurred [53]. We unlinked the substitution models across nuclear genes, but left the mitochondrial genes linked. Because a molecular clock hypothesis could be rejected for all loci (MGF: χ2 = 150.96398, PRKCI: χ2 = 60.019, SPTBN1: χ2 = 115.41636, THY: χ2 = 128.10964, mitochondrial: χ2 = 223.16424, d.f. = 30, p < 0.05), we used a relaxed, uncorrelated lognormal clock model and a Yule tree prior as implemented by the program. All other priors were left at their default settings.
Two independent MCMC chains were run for 10 million generations and sampled every 1,000 states, after which convergence was determined when the combined independent chains yielded posterior probability effective sample sizes (ESS) greater than 200. After examining trace files, the first 25% of the samples were discarded as burn-in and the remaining 7,501 samples from each run were combined in LogCombiner for a total of 15,002 sample genealogies per analysis. TreeAnnotator was used to summarize the trees into a single maximum clade credibility tree.
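Two pieces of arithmetic in this paragraph can be checked with a short sketch: the p-values of the molecular clock likelihood ratio tests at 30 degrees of freedom, and the number of samples retained after burn-in. The chi-square survival function below uses the closed form that holds for even degrees of freedom, so no external library is needed. The function names are hypothetical, and the assumption that the starting state is counted as a sample is inferred from the sample totals given in the text.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def retained_samples(generations: int, sample_every: int, burnin_fraction: float) -> int:
    """Samples left per run after burn-in, counting the starting state as a sample."""
    total = generations // sample_every + 1
    return total - int(total * burnin_fraction)


# Test statistics as reported in the text (d.f. = 30).
clock_lrt = {
    "MGF": 150.96398,
    "PRKCI": 60.019,
    "SPTBN1": 115.41636,
    "THY": 128.10964,
    "mitochondrial": 223.16424,
}
clock_p_values = {locus: chi2_sf_even_df(stat, 30) for locus, stat in clock_lrt.items()}
per_run = retained_samples(10_000_000, 1000, 0.25)
combined = 2 * per_run
```

Every reported statistic exceeds the 5% critical value for 30 degrees of freedom (about 43.77), consistent with rejection of a strict clock at each locus, and the bookkeeping reproduces the 7,501 samples per run and 15,002 combined samples stated above.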