Red colobus monkeys (genus Procolobus; Grubb et al. 2003; Oates and Davies 1994) are among the most threatened African primates due to hunting and habitat modification and loss (Struhsaker 2005; Ting 2008). Although habitat modification can drive disease susceptibility (Brearley et al. 2013), immunogenetic data for red colobus are lacking, which precludes disease association studies. Few wild, endangered primates have been characterized for immune genes, including those of the major histocompatibility complex (MHC). MHC genotyping of wild primates remains challenging, principally because of logistic constraints and because wild primates such as red colobus are non-model organisms (Sommer et al. 2013).

We tested a well-characterized MHC deep sequencing pipeline, developed for macaques (Karl et al. 2014), in wild red colobus. We genotyped a highly polymorphic 156 base pair region of MHC class I exon 2. Nine blood samples from Ugandan red colobus (Procolobus rufomitratus tephrosceles) from Kibale National Park were collected in EDTA in 2006 and 2010. One tissue sample from Temminck’s red colobus (Procolobus badius temminckii) was collected from Abuko Nature Reserve in the Gambia (Fig. 1). Genomic DNA was extracted using QIAamp DNA Mini Kits (Qiagen) following the manufacturer’s protocol and normalized to 10 ng/μl. The primers were SBT195-F (GCCTCGCTCTGGTTGTAGTAG) and SBT195-R (GGGCTACGTGGACGACAC) (Karl et al. 2014). Each 25 μl PCR reaction contained 10 ng of DNA, 12.5 nM of each primer, 100 nM of Fluidigm barcoded Illumina adapter primers, 12.5 μl of 2× Phusion High Fidelity PCR Master Mix (New England BioLabs), and 7.5 μl of water. PCR conditions comprised an initial denaturation for 3 min (98 °C), followed by 25 cycles of 5-s denaturation (98 °C), 10-s annealing (60 °C), and 20-s extension (72 °C), and a final 5-min extension (72 °C). PCR products were purified using an Agencourt AMPure XP (Beckman Coulter) bead cleanup. Purified products were quantified using PicoGreen chemistry (Thermo Fisher Scientific). We pooled the samples in equimolar concentrations and carried out two additional AMPure XP cleaning steps to remove remaining dimers. The library was quantified using the Qubit™ High-Sensitivity dsDNA Assay (Invitrogen) and sequenced as part of a single, paired-end run on an Illumina MiSeq using the 500-cycle MiSeq Reagent Kit v2 (Illumina, San Diego, CA, USA).
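The equimolar pooling step can be illustrated with a short calculation. The following is a minimal sketch, not the procedure actually used here: it assumes an average mass of about 660 g/mol per double-stranded base pair, and the function names, the amplicon length argument (the full product length including primers and adapters, which is not stated above), and the target amount per sample are all hypothetical.

```python
# Assumption: one double-stranded base pair weighs ~660 g/mol on average.
DS_BP_MASS_G_PER_MOL = 660.0


def ng_per_ul_to_nM(conc_ng_per_ul: float, amplicon_bp: int) -> float:
    """Convert a dsDNA mass concentration (ng/ul) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (DS_BP_MASS_G_PER_MOL * amplicon_bp)


def equimolar_volumes(conc_ng_per_ul: dict, amplicon_bp: int,
                      fmol_per_sample: float) -> dict:
    """Return the volume (ul) of each sample that contributes the same
    molar amount (fmol_per_sample) to the pool."""
    volumes = {}
    for sample, conc in conc_ng_per_ul.items():
        molarity_nM = ng_per_ul_to_nM(conc, amplicon_bp)  # 1 nM == 1 fmol/ul
        volumes[sample] = fmol_per_sample / molarity_nM
    return volumes
```

Samples with a higher measured concentration contribute proportionally less volume, so that each individual is represented by roughly the same number of template molecules in the sequencing library.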

Fig. 1

Haplotypes for MHC I-A/B, demographic information, sample type, and number of reads for Ugandan and Temminck’s red colobus

Raw Illumina reads were demultiplexed and primer and adapter sequences removed using Flexbar (Dodt et al. 2012). Primers were trimmed in left trim mode with a minimum overlap equal to the length of the primer with no unknown bases allowed. Sequences were then trimmed from the right to a maximum length of 156 base pairs. Paired-end reads were merged using FLASH under default settings (Magoc and Salzberg 2011). Merged reads were mapped against a curated database of known macaque MHC class I alleles from the Wisconsin National Primate Research Center (WNPRC) using Bowtie, allowing zero mismatches (Langmead et al. 2009). Mapped reads were removed as contamination (<0.0023 %), and unmapped reads were retained as red colobus sequences. Sequences were searched against the WNPRC macaque database using BLAST in Geneious Pro v. 6.1.2 (Biomatters) allowing a 1 % mismatch (Fig. 2). Sequences were assigned to a locus based on closest match in the macaque database. A minimum threshold of 100× coverage was used, and we retained only sequences that matched the expected sequence size (156) ± 3 base pairs in order to remove sequences in which a reading frame shift had occurred. We translated the sequences and searched for stop codons using the second base position as the start of the reading frame, and sequences containing a stop codon were designated as pseudogenes. Alleles with a Mean Per Amplicon Frequency (MPAF) <0.01 were visually checked for chimeras and sequencing errors (no chimeras identified, one allele removed as sequencing error) following Zagalska-Neubauer et al. (2010).
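The length, coverage, and stop-codon filters described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the code used in the study: the function names and return labels are hypothetical, the 100× threshold is assumed to apply to the read count of each unique sequence, and only the three standard stop codons are checked.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_stop_codon(seq: str, frame_offset: int = 1) -> bool:
    """Return True if a stop codon occurs in the given reading frame.

    frame_offset=1 starts translation at the second base, as in the text.
    Trailing bases that do not form a complete codon are ignored.
    """
    seq = seq.upper()
    for i in range(frame_offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return True
    return False


def classify_sequence(seq: str, read_count: int, expected_len: int = 156,
                      tolerance: int = 3, min_coverage: int = 100) -> str:
    """Apply the length, coverage, and stop-codon filters in that order."""
    # Discard sequences outside 156 +/- 3 bp (likely frame shifts).
    if abs(len(seq) - expected_len) > tolerance:
        return "discard_length"
    # Assumption: coverage is counted per unique sequence.
    if read_count < min_coverage:
        return "discard_coverage"
    # Sequences with an in-frame stop codon are designated pseudogenes.
    return "pseudogene" if has_stop_codon(seq) else "putative_allele"
```

Sequences labeled as putative alleles would still need the frequency-based check for chimeras and sequencing errors, which is not covered by this sketch.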

Fig. 2

Bioinformatic pipeline for novel MHC I haplotype identification

In total, 82 alleles (including two pseudogenes) were identified that mapped closely to six macaque (Macaca mulatta) MHC I loci (doi:10.5061/dryad.6r40g). Haplotypes were determined based on patterns of allele sharing across individuals (see Supplemental Material S1). The majority of haplotypes represented MHC I-A (6 haplotypes, 16 alleles) and MHC I-B (7 haplotypes, 41 alleles), suggesting that, as in macaques, MHC I has undergone extensive duplication events in red colobus. The number of reads per individual ranged from 7332 to 34,611 (average = 12,156). Six Ugandan red colobus were homozygous for A001a, one was homozygous for A002, and three were heterozygous. A005 and A006 were restricted to Temminck’s red colobus. One Ugandan red colobus was homozygous for B003, and eight were heterozygous. B007 was restricted to Temminck’s red colobus.

We have demonstrated that a protocol for genotyping MHC I in macaques can be applied successfully to a non-model primate, the red colobus. This finding shows that rapid and accurate MHC genotyping in endangered primates is currently feasible, given the availability of suitable biomaterials. Because our primers were designed for a distantly related Old World monkey, we predict that this approach will be successful across the family Cercopithecidae. Applying our method more broadly should facilitate studies of the genetic determinants of disease susceptibility as well as assessments of immunogenetic diversity in isolated primate populations.