Background

Genomic polymorphisms and rearrangements tell the tale of speciation and genomic evolution. Meaningful comparative phylogenetic and cytogenetic studies of species and populations require the assembly of chromosomal reference genomes [1, 2]. In non-model species, full sequencing of chromosomes remains rare [3, 4]. A daunting task facing mappers of chromosome scaffolds is the sequencing and positioning of repeat-containing sequences [5]. Although new technologies such as long read sequencing and proximity tagging of DNA fragments are greatly improving the quality of assembly, positioning of super-scaffolds into structured chromosomes remains challenging. Physical mapping on chromosomal spreads is a proven method of complementing bioinformatic-based assembly [6,7,8].

Chromosomal assembly is focused on one species at a time. However, reference genomes of sufficient quality can serve as templates for other species. Due to its circumpolar distribution and sensitivity to environmental changes [9,10,11], Rangifer tarandus (subfamily Capreolinae, family Cervidae), commonly known as caribou in North America and reindeer in Eurasia [9, 12], is a prime example of a key species monitored for documenting the impacts of climate change and habitat perturbation. North American caribou populations roam in Alaska, Greenland and northern Canada [13, 14]. Reindeer are found over the entire Northern Eurasian continent and on Svalbard [15, 16]. Many North American and Eurasian herds have been recently declining [11, 17,18,19].

To protect Rangifer, accurate genomic information is crucial, particularly for monitoring genetic diversity over time and for forensic purposes [20, 21]. As Rangifer taxonomy has recently been questioned [12], this information becomes instrumental to document and study genomic differences between species and subspecies. Several genomic assemblies have been published [20, 22,23,24], but none maps entire chromosomes. Based on cytogenetic analyses, the Rangifer tarandus genome is divided into 33 autosomal acrocentric chromosome pairs, a submetacentric autosome pair, and the sex pair, of which the Y is acrocentric, and the X is metacentric [25]. No karyotypic differences between European (reindeer) and American (caribou) populations are known [26]. Most studies of R. tarandus chromosomes go back decades [25,26,27,28,29,30], and from conventional G and C banding, an idiogram-based classification of chromosomes has been proposed [29, 30]. However, R. tarandus autosomes are particularly difficult to identify even with the low-band G-banding pattern because they are of very similar size and shape.

Information obtained from genome sequencing has led to the design of specifically targeted DNA probes. Single-locus FISH probes have been used widely for many years to position and visualize gene loci [31,32,33]. Development of massively parallel de novo synthesis procedures has allowed vast increases in the number of targetable loci [34,35,36]. PCR-synthesized oligonucleotides (oligo), both long [35] and as short as 70 nucleotides [34, 37, 38], are attractive for FISH studies since they allow generation of a renewable and affordable stock of probes that can be used for downstream cytogenetic research. For instance, pools of short synthetic oligo have proven useful in studies of chromosome pairing during meiosis [39], karyotyping [40], chromosomal evolution [41], genome architecture [42] and chromatin folding [43]. Recently, such pools were used to upgrade the banana (Musa spp.) genome with 19 libraries of 20,000 45-mer oligo [44].

The present work aimed to use Oligopaint FISH probes to map R. tarandus genome scaffolds physically onto chromosomes and thereby update a scaffold-level assembly proposed previously [20]. To our knowledge, this technology has not yet been used to assess scaffold validity or map chromosomes in mammals. The resulting high-quality reference genome should prove useful as a template for studying other species and provide a solid basis for comparative and evolutionary genomics.

Results

Probe design

Each Oligopaint FISH probe consisted of a large group of oligos separated by short gaps. Probe sequences were selected using publicly available tools, namely OligoMiner, Ifpd, and the OOD-FISH pipeline [38, 45]. After completing all probe design steps, 255,000 79-mers were selected and distributed in three pools of 85,000 oligo each for massively parallel synthesis. These oligos were distributed in 170 probes covering the 68 largest scaffolds (> 9.5 Mb) of the latest published genome [20], which corresponds to 78% of the assembled genome (2.01 Gb).

Probe distribution had to account for scaffold size and provide a specific colour pattern for identification purposes. For example, a set of five probes was designed for the first and largest scaffold while scaffolds 2 and 12 were addressed with sets of four probes since mis-assembly in the draft genome was suspected. In addition, 29 scaffolds were labelled with sets of three probes, 34 with sets of two probes, and two single probe sets were designed for the last two (shortest) scaffolds. The scaffold colour schemes are detailed in Fig. 1.
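The probe distribution described above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the counts reported in the text (probe sets per scaffold, 1500 oligo per probe); the variable names are ours and are not part of the design pipeline.

```python
# Probe sets as described in the text: {probes per scaffold: number of scaffolds}
probe_set_sizes = {5: 1, 4: 2, 3: 29, 2: 34, 1: 2}
OLIGOS_PER_PROBE = 1500  # value reported for each probe

n_scaffolds = sum(probe_set_sizes.values())
n_probes = sum(size * count for size, count in probe_set_sizes.items())
n_oligos = n_probes * OLIGOS_PER_PROBE

print(n_scaffolds, n_probes, n_oligos)  # 68 170 255000
```

The totals agree with the 68 scaffolds, 170 probes and 255,000 oligo reported for the design.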

Fig. 1

Oligonucleotide’s structure and scaffolds’ color pattern. a Oligo comprising a reverse primer sequence (yellow), genome homolog sequence (purple), forward primer sequence (orange) and adapter (green) complementary to the detection oligo (blue) linked to the fluorophore (red). Hybridization of the homolog with its specific genomic complementary sequence (black) is followed by hybridization of the fluorophore-bearing oligo. b The 68 scaffolds to be assembled on chromosomes are painted with one to five probes each labeled blue, yellow, or red, giving these three colours or green (blue + yellow), orange (yellow + red) or violet (red + blue). Two-probe patterns were repeated among two or three scaffolds because of limited color possibilities but were hybridized separately to ensure accurate chromosome attribution

Each probe (1500 oligo) spanned on average 401.5 kb of genome (149.4–1121.4 kb, SD: 139.9 kb). Within the same scaffold, probe sets were separated by a mean gap of 12.85 Mb (7.4–24.3 Mb, SD: 3.68 Mb) and located at an average distance of 4.64 Mb (0.25–14.13 Mb, SD: 3.31 Mb) from the scaffold termini. Probe set spacing was highly influenced by sequence quality within the targeted region, where low complexity, high probabilities of dimer or secondary structures and repeated elements could be avoided.

Chromosome assembly

The position of each of the 68 scaffolds was determined by hybridizing the corresponding probe sets alone or with the probe sets of other scaffolds to generate distinctive color patterns (Fig. 2). Initial hybridizations targeted presumptive chromosomal membership based on the previous mapping of the scaffolds to the bovine genome [20]; because scaffolds were not combined at random, this reduced the total number of hybridization steps. Full assembly was successfully achieved by permuting probe sets of a few scaffolds at a time for each chromosome spread hybridization and by comparing results across multiple hybridizations. Chromosomal mapping was confirmed by rehybridization of tandem scaffolds on additional chromosomal spreads. Scaffold mapping on both caribou and reindeer cells revealed no chromosomal differences between sub-species. Chromosome mapping of all 68 scaffolds was shown to cover all autosomes and a euchromatic part of the X chromosome (Fig. 2). These results showed that the oligo-based cytogenetic method can be successfully used for superscaffolding. Scaffolds were organized and oriented to significantly improve the reference genome quality. For instance, the resulting assembly is less fragmented, as indicated by the significant increase in the N50 value (Table 1). At the 50% mark of the total genome assembly, the shortest contig is now 54,365 kb, nearly double the previous value. Larger genomic fragments also translate into a significantly lower L90 value, which decreased by 37 scaffolds. This means that 90% of the entire genome (34 autosome pairs and the sex chromosomes) is now contained in only 94 fragments (Table 1).
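For readers less familiar with the N50 and L90 statistics quoted above, the following minimal sketch shows how such values are commonly computed from a list of sequence lengths. The function name and the toy lengths are illustrative assumptions; they are not the values of Table 1 nor the software used to produce it.

```python
def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for a list of sequence lengths.

    Nx is the length of the shortest sequence in the smallest set of longest
    sequences that together reach `fraction` of the total assembly size;
    Lx is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("lengths must be non-empty and fraction must be in (0, 1]")


toy_lengths = [50, 40, 30, 20, 10]  # made-up lengths, arbitrary units
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(n50, l50, n90, l90)  # 40 2 20 4
```

Joining scaffolds into longer fragments raises N50 and lowers L90, which is the direction of change reported here.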

Fig. 2

Hybridization of probes for scaffold mapping. a Hybridization of a group of ten scaffold probe sets, the first step for scaffold anchoring. Scaffold probes were labeled with FAM (cyan), ATTO-550 (yellow), ATTO-647 (red) or with a pair to generate another color (green for FAM + ATTO-550, orange for ATTO-550 + ATTO-647, violet for FAM + ATTO-647). b Example of unclear or mis-assembled signal scaffolds confirmed by hybridization in a smaller group. c Whole karyotype covered by the 68 scaffold probe sets allowing identification of all chromosomes with a single hybridization. All scaffolds were assigned to one pair of chromosomes by identifying their color scheme. Scale bar = 5 μm

Table 1 Comparison of R. tarandus genome assemblies proposed by different authors

Scaffold mis-assembly

The putative location of each probe within scaffolds was validated by hybridization on several chromosomal spreads of both caribou and reindeer cells. Six of the 68 scaffolds initially assembled by bioinformatics were found to be mis-assembled. Five of these were in fact composed of sequences belonging to two different chromosomes (Fig. 3 and Fig. S1). Chimeric scaffolds were all confirmed by FISH block split on chromosomal spreads. Break points were determined based on synteny with the Bos taurus genome. These are presented in Table S1 along with ID of probes surrounding break points and the bovine chromosomal homologs to the scaffold blocks. The sixth mis-assembled scaffold (number 21) contained an ordering error, revealed by a colour pattern that was differently ordered than initially expected (Fig. 3e and Fig. S1e).
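The criterion used to call a scaffold chimeric (probes of one scaffold hybridizing to more than one chromosome pair) can be expressed as a simple check. The sketch below uses the chromosome assignments given in the Fig. 3 caption; the data structure and function name are illustrative assumptions, and the ordering error in scaffold 21 is not covered by this test because all its probes lie on a single chromosome pair.

```python
# Chromosome pairs on which each scaffold's probes were observed (Fig. 3 caption).
observed = {
    1: {7, 20},
    2: {16, 18},
    7: {27},      # well-assembled control scaffold
    12: {1, 4},
    20: {2, 34},
    25: {3, 8},
}


def chimeric_scaffolds(assignments):
    """Return scaffolds whose probes hybridize to more than one chromosome pair."""
    return sorted(s for s, chroms in assignments.items() if len(chroms) > 1)


print(chimeric_scaffolds(observed))  # [1, 2, 12, 20, 25]
```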

Fig. 3

Validation of doubtful scaffolds integrity visualized on in situ chromosomes. White arrows in images a–d indicate mis-assembled scaffold splits on chromosome pairs: scaffold 1 on chromosome pairs 7 and 20, scaffold 2 on 16 and 18, scaffold 12 on 1 and 4, scaffold 20 on 2 and 34. Green arrows indicate well-assembled scaffold 7 on chromosome pair 27. Probes hybridized also with non-lysed cells (N). e Reorganized scaffold 21, the first part (including the first probe) is inverted and positioned after the second and third probes. f Mis-assembled scaffold 25 split on chromosome pairs 3 and 8. Scale bar, 5 μm

Chromosome ordering and idiogram building

Once all scaffolds were positioned, chromosome orientation was determined by positioning the centromeres. Given that nearly all autosomes are acrocentric and of similar size, centromere positions and scaffold orientation were determined by visualization of colour patterns on late metaphase chromosomes, since chromatids are easier to distinguish at this stage (Fig. 4). This method led to the positioning of all centromeres based on several observations for each chromosome. By cytogenetic convention, autosomes are usually grouped according to conformation or structure and ranked from the longest to the shortest. Here, chromosomes were ordered from acrocentric, submetacentric to metacentric and by decreasing length [30] (Fig. 5). Length was expressed relative to the X chromosome, which is the largest and therefore the most easily identifiable of all Rangifer tarandus chromosomes. Using this ratio avoided any length bias caused by chromosomal compaction related to cell mitotic cycle, which can vary across spreads. The relative lengths were used for chromosome ordering and numbering. Chromosome mapping of all 68 scaffolds and their respective lengths are presented in Table 2.
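The normalization to the X chromosome can be illustrated as follows. This is a minimal sketch with hypothetical measurements and labels; it covers only the length-based ranking, not the prior grouping by chromosome morphology, and is not the measurement software used in this study.

```python
from statistics import mean


def relative_lengths(spreads):
    """Average, over spreads, the length of each chromosome divided by the X length.

    `spreads` is a list of dicts {chromosome label: measured length}, each
    containing an "X" entry measured on the same spread.
    """
    ratios = {}
    for spread in spreads:
        x_length = spread["X"]
        for label, length in spread.items():
            if label != "X":
                ratios.setdefault(label, []).append(length / x_length)
    return {label: mean(values) for label, values in ratios.items()}


# Hypothetical measurements (arbitrary units) from two spreads with different compaction.
spreads = [
    {"X": 10.0, "a": 5.0, "b": 4.0, "c": 6.0},
    {"X": 8.0, "a": 4.0, "b": 3.0, "c": 5.0},
]
rel = relative_lengths(spreads)
order = sorted(rel, key=rel.get, reverse=True)
print(order)  # ['c', 'a', 'b']
```

Because each length is divided by the X length from the same spread, a uniformly more condensed spread yields similar ratios, which is the rationale given above for using this reference.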

Fig. 4

Centromere positioning revealed by hybridization on late metaphase chromosomes. More condensed chromosomes and clear separation of chromatids make centromeres easier to distinguish. Inset: chromatid attachment point (centromere to the upper left side) visualized on R. tarandus chromosome 1. Scale bar = 5 μm

Fig. 5

Idiographic and physical mapping of Rangifer tarandus scaffolds. All scaffolds were positioned on a chromosome. Centromeres are represented by black tips or bands. The first 33 chromosomes are acrocentric. The only submetacentric chromosome is #34. The X chromosome is the largest and is metacentric. All X chromosome scaffolds were placed on the short arm. The Y chromosome was not studied

Table 2 Rangifer tarandus chromosome lengths and current scaffold mapping

Investigation of synteny and evolution

To test the value of this novel Rangifer tarandus reference genome assembly, we compared it first with bovine chromosomes. A previous bioinformatic comparison [20] predicted high synteny between the two species, which was confirmed here; the observed chromosomal rearrangements are highlighted in Fig. 6. Correction of the six mis-assemblies that we found further emphasized synteny (Fig. 7). We then used the new assembly to model the evolution of chromosome 1 across related species by comparing orthologous sequences in species of Cervidae and Bovidae for which draft genomes are publicly available. Two chromosomal rearrangements were found to be common to all studied Cervidae species (Cervus elaphus (red deer), R. tarandus and Odocoileus hemionus (mule deer)) compared to Bos taurus and Capra hircus (domestic goat) (Fig. 8). Additionally, a third rearrangement, observed in R. tarandus and Odocoileus hemionus but not in the other species, is suspected in their evolution (Fig. 8). The most parsimonious explanation for these rearrangements is that the common ancestral chromosome 1 first split in the Cervidae lineage to give rise to a small acrocentric chromosome (containing one R. tarandus scaffold, in green) and a larger chromosome (containing three scaffolds, in red, yellow and blue) (Fig. 8). A translocation within the larger resulting chromosome then relocated the distal portion near the centromere. The chronology of these two events was not determined. Finally, a pericentric inversion of the proximal portion of the chromosome occurred, leading to a submetacentric configuration in the genera Odocoileus and Rangifer (Fig. 8).

Fig. 6

Sankey diagram illustrating associations between Rangifer tarandus chromosomes (RtChr, left) and Bos taurus chromosomes (BtChr, right). Chromosomal rearrangements (fusion and fissions) are shown in colour. Except for the coloured chromosomes, chromosomal order is maintained overall, the biggest discrepancy being of four chromosomal positions (i.e., RtChr13)

Fig. 7

Jupiter plot representing mapping of the corrected R. tarandus assembly on the B. taurus genome. The 111 largest scaffolds (on the right) show high synteny with the 29 autosomes plus X chromosomes from the cattle assembly (ARS-UCD1.2). Intersecting bands, which represent non-syntenic regions between the two species, are fewer in comparison with the previous mapping [20]

Fig. 8

Suggested evolution of bovine chromosome 1 ancestral ortholog in Cervidae and Bovidae. Relative to bovine chromosome 1 and its caprine ortholog, three events appear to have been passed on to R. tarandus and O. hemionus. First, fission of the common ancestor’s bovine chromosome 1 ortholog gave rise to a short acrocentric chromosome (containing R. tarandus scaffold 7, green) and a longer one (containing scaffolds 13, 20 and 56, yellow, blue, red). A translocation then occurred within the distal portion of the longer one, near the centromere. The order of these two events has not been confirmed. Finally, a pericentric inversion of the proximal part of the longer chromosome occurred, leading to a submetacentric configuration in the genera Odocoileus and Rangifer

Discussion

The work presented herein leads to the development of a karyotyping method for Rangifer tarandus. Since most autosomes are similar in this species, the capacity to generate color banding patterns specific to each chromosome provides a useful tool for chromosome identification. Specific sets of probes can be selected and reamplified to highlight a subset of chromosomes or of chromosomal regions.

The color patterns made it possible to upgrade the genomic reference from bioinformatically assembled scaffolds to a chromosome-level assembly. Completely sequenced chromosomes are major scientific advances and a valuable resource for further genomic studies. However, very seldom are initial draft genomes assembled to this extent. The use of proximity tagging strategies and various DNA sequencing platforms that provide different coverage and read lengths can yield sufficient data to generate super-scaffolds [2, 3, 5, 46], but even these large genomic fragments rarely cover entire chromosomes. One of the main obstacles is the presence of repeated elements that make it impossible to establish with certainty the relative positions of scaffolds [5]. Cytogenetics offers a complementary strategy to bioinformatics by physically positioning sequences of interest through in situ hybridization [4, 7, 44, 47,48,49,50,51].

To date, animals for which complete chromosome-mapped genomes are available for reference purposes include humans [52], mice [53], Drosophila [54], zebra fish [55], chickens [56], cattle [57], swine [58], sheep [59], and goats [60]. However, several wild species of conservation concern have not been fully chromosome-assembled [3, 4, 6, 61]. For instance, four Rangifer tarandus draft genome assemblies have been published in recent years [20, 22,23,24]. However, none have reached chromosome-level assembly. Herein, we used Oligopaint FISH probes (modifiable synthetic short oligonucleotides) to anchor the 68 currently largest Rangifer tarandus scaffolds from the most recent R. tarandus genome [20] to chromosomes. The oligonucleotide-based technology was chosen instead of the de novo synthesis of DNA probes from bacterial artificial chromosomes (BAC) since this latter method is less efficient and more expensive [7, 62,63,64]. Another alternative could have been to use existing probe-sets developed in other species but even with meticulous sequence selection, hybridization success rate can vary considerably [48]. Moreover, these probes generally target both single-copy sequences and repetitive sequences, which significantly reduces specificity and possible applications [65].

Long synthetic oligo have been used to probe genomic regions as small as 6.7 kb [33, 35]. Shorter probes (≤ 100 bp) have been effective for 10 kb regions [34, 36], albeit more efficient when targeting 52 kb to 2.1 Mb to hybridize on nuclei [36]. The probe sets designed in our study targeted 401.5 kb on average, a length for which our probe sets containing 1500 oligo appeared to be optimal. Mean oligo density per probe was thus 3.74/kb, lower than the 5.5/kb previously used for targeting 500 kb [36]. However, mean densities as low as 1.71/kb have been used successfully to hybridize with 500 kb sequences chosen within chromosomal sets [42]. Our hybridization results show that small variations in the number of oligo per probe, oligo density and the targeted sequence length have little impact on detection, and that the method is therefore robust and flexible.
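The density figure quoted above follows directly from the probe size and the mean targeted length. The short sketch below reproduces that calculation; the fraction of the target covered by 39-nt homologs is an additional illustrative quantity that we derive here from the design values and is not reported elsewhere in the text.

```python
def oligo_density(n_oligos, target_kb):
    """Mean number of oligo per kb of targeted sequence."""
    return n_oligos / target_kb


def covered_fraction(n_oligos, homolog_nt, target_kb):
    """Fraction of the targeted span covered by the genome-homologous parts of the oligo."""
    return (n_oligos * homolog_nt) / (target_kb * 1000)


density = oligo_density(1500, 401.5)
coverage = covered_fraction(1500, 39, 401.5)
print(round(density, 2))   # 3.74
print(round(coverage, 3))  # 0.146
```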

Another parameter that could influence probe detection in oligo-based technology, particularly the resolution at which probes are distinguishable from each other, is inter-probe distance. In the present study, the lowest inter-probe distance (excluding the distance between probes of adjacent scaffolds) was 7.4 Mb, that is, between the first and second probes of chromosome 8. This is consistent with previous studies in which a 7–8 Mb gap between probes was sufficient [33, 41]. The minimal inter-probe distance to ensure acceptable probability of detection and visualization is the main reason why smaller scaffolds could not be used. Overall, our probe design parameters allowed us to map all selected scaffolds, which account for 78% (2.01 Gb) of the entire R. tarandus genome. This coverage is comparable to other bioinformatic and FISH-assisted genome assemblies [4, 48].

We are also proposing a new R. tarandus idiogram based on fluorescent banding patterns. Previously published karyotypes ordered chromosomes differently, and banding patterns were most often of too low a resolution to clearly identify R. tarandus chromosomes, especially since they are nearly all acrocentric and many are of similar size [30, 66, 67]. Moreover, the submetacentric chromosome was placed differently across karyotypes, either as the first autosome [67], the last one [25, 30, 31] or amongst the acrocentric chromosomes [66]. Herein, ordering is based on chromosome size evaluated by repeated length measurements, with the submetacentric chromosome pair placed as the last autosome. A recent publication reports a higher resolution R. tarandus G-banding karyotype offering a different chromosomal ordering where the submetacentric pair is not identified [68]. While we used B. taurus as reference for comparative analysis, Proskuryakova and colleagues used a comparative approach using probes derived from Camelus dromedarius [68]. The two reference species are known to harbour several evolutionary chromosomal rearrangements [69] preventing clear chromosome identification between studies.

The probe sets were designed to position and orient all selected scaffolds on chromosomal spreads. Because the initial scaffolds were obtained from a bioinformatic assembly, the probe sets also confirmed the existence of the in silico derived fragments and allowed the detection of chimeric sequences. A total of 18 breaks in synteny were previously identified when mapping R. tarandus scaffolds to a bovine genome [20]. Such analysis cannot distinguish true chromosomal rearrangements from chimeric assembly. The split scaffolds identified in the present study match five of the seven largest previously identified potential inter-species discrepancies, confirming that these were examples of scaffolding errors. Furthermore, colour patterns revealed an intra-scaffold rearrangement. All corrected scaffolds were among the 40% largest, which seems concordant with the increased number of sequence matching events needed to lengthen the scaffolds. Several reference genomes have been corrected after publication by visualizing potential errors through FISH probe hybridization [6,7,8], thus supporting the usefulness of cytogenetics in genome assembly. Corrected scaffolds reduced syntenic breaks observed previously [20] and further confirmed the reported high synteny between Cervidae and Bovidae [22, 23, 70].

Despite this high synteny between Cervidae and Bovidae, chromosomal rearrangements that were not highlighted in the previous genome mapping [20] were revealed by cytogenetics. Several studies comparing species in the infraorder Pecora revealed evolving chromosomal rearrangements [25, 47, 49, 71,72,73,74,75]. A chromosome painting study comparing B. taurus and several deer species including red deer, milu deer (Elaphurus davidianus), rusa deer (Cervus timorensis russa), Eld’s deer (Rucervus eldii), fallow deer (Dama dama), roe deer (Capreolus capreolus), Chinese muntjac (Muntiacus reevesi) and moose (Alces alces) has revealed karyotype differences traceable to fission of cattle chromosomes 1, 2, 5, 6, 8 and 9 and tandem fusion of cattle chromosomes 26 and 28 [49]. Our results support the same karyotypic evolution and suggest that bovine chromosome 28 has the same centromeric region as R. tarandus chromosome 6 (Fig. S2). We therefore hypothesize that the centromere of bovine chromosome 26 formed after the fission. The centromere of bovine chromosome 28 has also been associated with the centromere of C. elaphus chromosome 15, which also contains both bovine chromosomes 26 and 28 [70]. Although the rearrangement involving B. taurus chromosomes 26 and 28 was unambiguously predicted by bioinformatics [20], fission of chromosomes 1, 2, 5, 6, 8 and 9 was not, thus showing the usefulness of physical mapping.

Bovine chromosome 1 represents an interesting case as it has been associated with many chromosomal rearrangements among Cetartiodactyla [49]. In nine Cervidae species studied in that review, bovine chromosome 1 was found to be split into a smaller acrocentric chromosome and a larger acrocentric or submetacentric chromosome. In our mapping, the scaffold associated with the proximal part of bovine chromosome 1 is located alone on the small acrocentric R. tarandus chromosome 27, and the distal part is located on the submetacentric R. tarandus chromosome numbered 34.

To explore Cervidae chromosomal evolution further, we mapped the scaffolds related to bovine chromosome 1 to the latest versions of the mule deer genome (Odocoileus hemionus; GCA_020976825.1) and the red deer (Cervus elaphus; GCA_910594005.1) genome. It has been reported that the distal portion of the larger chromosome resulting from the split of the bovine chromosome has undergone a translocation to the middle/proximal region in several Cervidae species [47, 49, 70]. Furthermore, in the Capreolinae subfamily, containing the genera Odocoileus, Rangifer and Alces among others, a pericentric inversion within a large acrocentric chromosome leading to a submetacentric type has been reported [25, 29, 71]. Based on suggested karyotype evolution [25, 71], cross-species hybridization [49] and genome assembly [70], C. elaphus does not contain this pericentric inversion. Our mapping shows the same scaffold order on C. elaphus, O. hemionus and R. tarandus, suggesting that these three cervid genomes contain the same translocation. The pericentric inversion was not confirmed directly by FISH since no non-inverted chromosome was probed for comparison. However, BAC probe hybridization results for R. tarandus chromosomes [49] and our R. tarandus assembled genome comparison with B. taurus genome (Fig. S3) tend to support this rearrangement. Based on these observations, we suggest a karyotype evolution scheme including Bos taurus, Capra hircus, Cervus elaphus, Rangifer tarandus and Odocoileus hemionus (Fig. 8) in which fission of an ancestral bovine chromosome 1 ortholog gave rise to a small acrocentric R. tarandus chromosome containing one scaffold (in green) and a larger one containing three scaffolds (yellow, blue, and red) within which a translocation moved the distal portion to near the centromere in extant Cervidae. The chronology of these two chromosomal rearrangement events remains to be determined. Finally, a pericentric inversion of the proximal portion occurred, leading to a submetacentric configuration in the genera Odocoileus and Rangifer (Fig. 8). We expect that both the translocation and the pericentric inversion occurred in Alces alces and other Odocoileus species since their karyotypes are closely related according to previous phylogenetic mapping [25]. Further FISH experiments will be needed to test this hypothesis. Since cross-species hybridization can sometimes prove informative, especially for confirming specific evolutionary genomic reorganizations [47, 49], all probes developed herein for R. tarandus have been made available (supplemental data).

Conclusions

Mapping 78% of the Rangifer tarandus genome onto chromosomes adds considerable value to the reference genotyping of this species. These results will provide resources for future studies of caribou and reindeer phylogenetics, conservation genetics and cytogenetics. The Cervidae family is remarkable for the high diversity of its chromosome shapes and numbers, which represents both challenges and opportunities for scientific research. Rangifer comprises several species and subspecies that have yet to be fully characterized genetically. Since many circumpolar caribou and reindeer populations are listed as endangered or threatened, precise genomic information for conservation and management will become an important asset.

Methods

All chemicals were purchased from ThermoFisher (Mississauga, ON, Canada) unless specified otherwise.

Probe design

Probes were designed as described previously [45] using 1500 synthetic oligo per probe. Each oligo consisted of a scaffold-specific 39-mer homolog flanked by reverse and forward primer sequences, the latter extended by a 5′ adapter complementary to a fluorophore-bearing detector sequence, as shown in Fig. 1. The primers, adapter and detector sequence were all 20-mers. The 39-mer sequences were designed on the repeat-masked Rangifer tarandus genome [20] using OligoMiner tools [38] in the default “balance” mining mode, with a few exceptions. The minimal and maximal lengths were both set at 39 nt in blockParse.py (-l and -L) to ensure candidate length homogeneity. Unique Mode (UM) and Linear Discriminant Analysis Mode (LDM) were used for the Bowtie2 alignment and outputClean.py steps. Results from the two modes were compared and the oligo in common were chosen. Candidates were filtered by running the optional kmerFilter.py script then structureCheck.py with a simulated hybridization temperature (-T) set at 42 °C. This eliminated high-abundance k-mers and secondary structures among candidate probes.
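The step in which candidates common to both filtering modes are retained can be sketched as a set intersection on genomic coordinates. The example below is a minimal illustration only: it assumes tab-separated, BED-like records (scaffold, start, stop, sequence, then optional fields), and the function names and toy records are hypothetical rather than part of OligoMiner.

```python
def parse_bed(lines):
    """Parse BED-like lines into {(scaffold, start, stop): sequence}."""
    records = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        scaffold, start, stop, seq = fields[:4]
        records[(scaffold, int(start), int(stop))] = seq
    return records


def common_candidates(um_lines, ldm_lines, length=39):
    """Keep candidates present in both outputs and of the expected length."""
    um = parse_bed(um_lines)
    ldm = parse_bed(ldm_lines)
    shared = sorted(um.keys() & ldm.keys())
    return [(key, um[key]) for key in shared if len(um[key]) == length]


# Made-up 39-nt sequences and coordinates, for illustration only.
seq_a = "ACGT" * 9 + "ACG"
seq_b = "TTGCA" * 7 + "TTGC"
um_lines = [f"scaffold_1\t100\t139\t{seq_a}\t80.1",
            f"scaffold_1\t500\t539\t{seq_b}\t79.5"]
ldm_lines = [f"scaffold_1\t500\t539\t{seq_b}\t79.5"]

kept = common_candidates(um_lines, ldm_lines)
print(kept[0][0])  # ('scaffold_1', 500, 539)
```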

Ifpd tools v2.0.4 [45] were then used to design and select probes. Default parameters were used except for ifpd_query_set, which was run with --order centrality homogeneity size --n-oligo 1500. Probe sets were chosen based on distance between probes and from ends of scaffolds to ensure sufficient inter-probe distance for microscopic resolution. The number of probes per scaffold was set according to scaffold length and the maximal number that filled three 92 K Genscript oligo libraries. Selected 39-mers are shown in Additional file 1.

Orthogonal primer sequences were generated as described previously [45]. Briefly, the 240,000 orthogonal 25-mers designed previously [76] were each sliced into six 20-mers to generate a total of 1,440,000 20-mers. The OOD-FISH pipeline [45] was then used with default parameters, except that BLAST v2.7 was used instead of BLAT to align the sequences to the non-repeat-masked Rangifer tarandus genome [20], with the following parameters: blastn -word_size 6 -evalue 1000 -penalty -2 -reward 1 -task 'blastn' -outfmt 6. Sequences producing a hit with an e-value < 25 were filtered out, so that the retained candidates lacked significant homology with the genome. Retained sequences were then filtered for self and hetero-dimers (SDFE and HDFE filters, respectively) and sequences with a free energy ≥ −9 kcal/mol were kept. Candidates with a 5′ GC clamp were preferentially assigned to reverse primers and those with a 3′ clamp to forward primers to promote stronger binding during amplification and transcription; the others were distributed randomly. To preserve fluorophore interchangeability, forward primer and adapter sequences were fluorophore specific. Reverse primer sequences were scaffold-specific and drawn from a library to allow individual amplification. All orthogonal sequences used are listed in Additional file 2.
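The slicing of each 25-mer into overlapping 20-mers, and a simple reading of the GC clamp criterion, can be sketched as follows. The clamp definition used here (a G or C as the terminal base) is our simplifying assumption, as are the function names and the example sequence; the OOD-FISH pipeline may apply a different rule.

```python
def slice_kmers(seq, k=20):
    """Return all overlapping k-mers of seq (a 25-mer yields six 20-mers)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def gc_clamp(seq):
    """Classify a primer by which terminal base is G or C (assumed clamp rule)."""
    five = seq[0] in "GC"
    three = seq[-1] in "GC"
    if five and three:
        return "both"
    if five:
        return "5prime"
    if three:
        return "3prime"
    return None


example_25mer = "ACGTACGTACGTACGTACGTACGTA"  # made-up sequence
windows = slice_kmers(example_25mer)
print(len(windows))                   # 6
print([gc_clamp(w) for w in windows[:4]])  # [None, '5prime', 'both', '3prime']
print(240_000 * len(windows))         # 1440000
```

Under this assumed rule, windows classified as "5prime" would be candidates for reverse primers and those classified as "3prime" for forward primers.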

Genomic 39-mer homologs and primer sequences were assembled into final 79-mers, which were purchased from GenScript Biotech (Piscataway, NJ, USA). They are listed in Additional file 1. Primers and detection oligo were purchased as standard desalted and HPLC-purified oligo, respectively, from Integrated DNA Technologies (IDT) and are listed in Additional file 3.
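The composition of a library oligo (two 20-nt primer sites flanking a 39-nt homolog, 79 nt in total) can be checked with a short helper. This is a sketch under stated assumptions: the concatenation order follows the listing in the Fig. 1 legend, strand orientation is not modelled, and the sequences and function name are hypothetical.

```python
PRIMER_LEN = 20
HOMOLOG_LEN = 39


def assemble_oligo(reverse_primer, homolog, forward_primer):
    """Concatenate the three parts after checking their lengths and alphabet."""
    parts = (
        ("reverse primer", reverse_primer, PRIMER_LEN),
        ("homolog", homolog, HOMOLOG_LEN),
        ("forward primer", forward_primer, PRIMER_LEN),
    )
    for name, seq, expected in parts:
        if len(seq) != expected:
            raise ValueError(f"{name} must be {expected} nt, got {len(seq)}")
        if set(seq) - set("ACGT"):
            raise ValueError(f"{name} contains characters other than A, C, G, T")
    return reverse_primer + homolog + forward_primer


# Made-up sequences of the required lengths.
oligo = assemble_oligo("ACGT" * 5, ("ACGTT" * 8)[:39], "TGCA" * 5)
print(len(oligo))  # 79
```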

Colour attribution

Colour schemes for scaffold detection and identification were assigned avoiding scheme repetitions within a library to ensure scaffold identification within a hybridization pool. Fluorophores used for each probe are listed in Additional file 4. Colours were generated using 6-FAM, ATTO 550, or ATTO 647N bound to the 3′ end of the fluorophore-bearing oligo (Additional file 3). To obtain additional colours (green, orange, or violet), probes were made with two adapters specific for different fluorophores.
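The six probe colours obtained from three fluorophores, used singly or in pairs, can be enumerated as in the sketch below. The colour names follow the Fig. 1 legend; the dictionaries and function name are illustrative assumptions and do not reproduce the per-probe assignments listed in Additional file 4.

```python
from itertools import combinations

# Colour displayed by each fluorophore alone (per the Fig. 1 legend).
SINGLE = {"6-FAM": "blue", "ATTO 550": "yellow", "ATTO 647N": "red"}
# Colour obtained when a probe carries adapters for two fluorophores.
MIXED = {
    frozenset({"6-FAM", "ATTO 550"}): "green",
    frozenset({"ATTO 550", "ATTO 647N"}): "orange",
    frozenset({"6-FAM", "ATTO 647N"}): "violet",
}


def probe_colour(fluorophores):
    """Return the colour of a probe labelled with one or two fluorophores."""
    key = frozenset(fluorophores)
    if len(key) == 1:
        return SINGLE[next(iter(key))]
    return MIXED[key]


palette = [probe_colour([f]) for f in SINGLE]
palette += [probe_colour(pair) for pair in combinations(SINGLE, 2)]
print(palette)  # ['blue', 'yellow', 'red', 'green', 'violet', 'orange']
```

With only six colours available, some multi-probe patterns necessarily recur among the 68 scaffolds, which is why repeated patterns were hybridized separately.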

Probe synthesis

Probes were produced as described previously [77] with slight modifications. Briefly, oligo libraries were amplified by real-time PCR using PerfeCTa SYBR Green FastMix (Quantabio, Beverly, MA, USA) in a LightCycler 480 II (Roche, Rotkreuz, Switzerland) to analyze amplification curves. Reagents were added as described elsewhere [77]. The T7 RNA polymerase recognition site was added at the 5′ end of the reverse primers for in vitro transcription. Probes were generated individually in separate wells by adding the corresponding oligo library and the specific reverse primer for each scaffold. Forward primers with specific 5′ detection adapter sequences were added to the reaction mixture. The PCR product was purified using SparQ PureMag Beads (Quantabio, Beverly, MA, USA) and quantified using a NanoDrop One (ThermoFisher Scientific) according to the manufacturer’s instructions.

Purified PCR products were transcribed in vitro using a HiScribe T7 High Yield RNA Synthesis Kit (New England BioLabs, Ipswich, MA, USA) according to the manufacturer’s instructions but with 1 μL of RNaseOUT Recombinant Ribonuclease Inhibitor per 20 μL total reaction volume. The reaction was run for 12–16 h in a C1000 Touch Thermal Cycler (Bio-Rad, Mississauga, ON, Canada). DNA was removed by digestion with DNase I (New England BioLabs), and RNA was purified with RNAClean XP (Beckman Coulter, Mississauga, ON, Canada) and quantified using a NanoDrop One (ThermoFisher Scientific) according to the manufacturer’s instructions.

Purified RNA was reverse-transcribed using Maxima H Minus Reverse Transcriptase as described previously [77] with 5 μg of template RNA in 20 μL total reaction volume. Complementary DNA was purified using Zymo-Spin IC Columns (Cedarlane, Burlington, ON, Canada) and quantified using a NanoDrop One (ThermoFisher Scientific, ON, Canada) according to the manufacturer’s instructions.

Cell culture and sample preparation

Cryopreserved fibroblast cells available at the Toronto Zoo were used for karyotyping. Cells were originally derived from biopsy punches of a female caribou (Porcupine herd, Canada, Rangifer tarandus granti) and a female Eurasian tundra reindeer (Rangifer tarandus tarandus), both housed at the Toronto Zoo (Ontario, Canada). Cells were thawed and cultured at 37 °C in T-25 flasks containing Multicell DMEM/F12 medium (cat. 319-085-CL, Wisent Inc., St-Bruno, QC, Canada) supplemented with 20% fetal bovine serum (Wisent Inc.) and 1% penicillin-streptomycin (Wisent Inc.) in a humidified 5% CO2 atmosphere. Cells were harvested as described in a published protocol [78]. Briefly, cultures at 80% confluence were treated with KaryoMAX™ Colcemid™ solution (cat. #15210040) for 45 min, followed by hypotonic treatment with potassium chloride (0.075 mol/L) and fixation in Carnoy’s fixative. Fixed cells were dropped from a height of 15 cm onto microscope slides at room temperature and 50% humidity. Slides were aged for 24 h at room temperature before hybridization.

Hybridization of Oligopaint FISH probes

Probes were hybridized following a protocol described elsewhere [38] with some modifications. Briefly, slides were immersed first in 70% formamide in 2X saline-sodium citrate buffer (SSC) at 70 °C for 2 min, then in 70, 90 and 100% ethanol at −20 °C for 3 min each. The 79-mer hybridization mix, containing 0.4 μmol/L of each scaffold probe in 50% formamide and 10% dextran sulfate in 2X SSC, was added (40 μL) to each slide, covered with a LifterSlip (Electron Microscopy Sciences, Hatfield, PA, United States) and hybridized for 16–18 h at 40 °C in the humidified chambers of an Array Booster AB410 (Advalytix AG, Brunnthal, Germany). Slides were immersed in 0.1% Tween 20 in 2X SSC at 60 °C to remove the LifterSlip, soaked for 15 min, transferred twice to fresh Tween 20 buffer at ambient temperature for 5 min each, then air-dried.

For hybridization of the detection oligos, 40 μL of hybridization II mix (3 μmol/L of each labelled detection oligo in 30% formamide in 2X SSC) was placed on each slide, covered with a LifterSlip and hybridized at ambient temperature for 1 h. The LifterSlip was removed using the Tween 20 buffer at ambient temperature, and the slide was washed twice in the same buffer (10 min, then 2 min) and then in 0.2X SSC for 2 min. Excess buffer was drained and TrueVIEW autofluorescence quenching was carried out according to the manufacturer’s instructions (Vector Laboratories, Burlingame, CA, United States). Samples were mounted in Vectashield Vibrance antifade medium with DAPI (Vector Laboratories) and cured for at least 2 h before imaging.
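
For bench planning, the stated concentrations and volumes convert directly into the amount of oligo applied per slide. The short sketch below does this conversion for the two hybridization mixes; the helper name is hypothetical and the calculation assumes the full 40 μL is applied to each slide.

```python
def picomoles(conc_umol_per_l, volume_ul):
    """Amount of oligo in pmol for a concentration in µmol/L and a volume in µL.

    µmol/L x µL = (1e-6 mol/L) x (1e-6 L) = 1e-12 mol = 1 pmol,
    so the two numbers can be multiplied directly.
    """
    return conc_umol_per_l * volume_ul


# Scaffold probes: 0.4 µmol/L of each probe in 40 µL of hybridization mix.
probe_pmol = picomoles(0.4, 40)

# Detection oligos: 3 µmol/L of each oligo in 40 µL of hybridization II mix.
detection_pmol = picomoles(3, 40)

print(f"{probe_pmol:.0f} pmol per scaffold probe per slide")
print(f"{detection_pmol:.0f} pmol per detection oligo per slide")
```

Under these assumptions, each slide receives about 16 pmol of each scaffold probe and 120 pmol of each labelled detection oligo.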

Imaging and analysis

Imaging was performed on a Nikon Eclipse E600 microscope with a Nikon C-SHG1 Super High-Pressure Mercury Lamp and a 100× oil immersion Nikon objective. Filters were obtained from Nikon or Chroma, and the following excitation/emission wavelength ranges (in nm) were used: 340–380/435–485 for DAPI, 465–495/515–555 for 6-FAM, 510–560/>570 for ATTO 550, and 625–655/665–715 for ATTO 647N. Images were acquired using a QImaging EXi Blue CCD camera and QCapture Pro 7 software. Image production and analysis were carried out using Fiji ImageJ [79]. Paint 3D was used for Figs. 2 and 4. The straightening tool in ImageJ was used for karyotype images of highly curved chromosomes (Fig. 2). Figure 3 was produced using the sankeyNetwork function of the networkD3 package v0.2.4 [80] in R. Synteny with bovid and other cervid genomes was investigated using minimap2 [81] and visualized using the JupiterPlots bioinformatic tool on Linux [82].