Objective

Sheep were among the first animals to be domesticated. They most likely descend from Asian mouflons (Ovis orientalis) that were tamed around 11,000 years before present (BP), possibly in the Zagros Mountains and/or southeastern Anatolia, within the Fertile Crescent [1]. Sheep are farmed all over the world, contribute considerably to the production of animal-based protein for human nutrition, and are important to the agricultural economy. In western Asia, Iran, with more than 52 million sheep classified into 27 ecotypes, is potentially one of the significant reservoirs of sheep genetic resources. Iran is a hot and dry country because of its location on the Africa-Asia desert belt, and about 90% of its territory is arid and water-scarce [2, 3]. Conserving genetic diversity is critical for increasing production efficiency and improving adaptation to changing climatic conditions. Identifying the genes and mechanisms involved in responses to heat stress and in immune reactions can increase livestock production and also help to maintain genetic diversity in these areas [4]. The data set presented here provides a practical resource for studies of livestock suited to arid environments and for testing further genetic hypotheses.

Data description

Blood was collected from three sheep breeds native to Iran: Karakul (4 samples), Zel (4 samples), and Kermani (2 samples). From each animal, 5 ml of blood was drawn from the jugular vein. Karakul sheep, one of the leather breeds, are native to North Khorasan province, which lies in the northeast of Iran at 300 m above mean sea level (AMSL). The prevailing climate of this region is hot and dry because of its proximity to the central desert of Iran. The Zel, a tailed breed, is mostly raised in Mazandaran province in the north of Iran, near the Caspian Sea, at 2 m AMSL. The climate of this region falls into two types, humid and mountainous, owing to the presence of the sea, mountains, and forests. Kermani sheep are raised in the southeastern parts of Iran, especially in Kerman province. This wool breed is well adapted to the hot and dry climate of the province, which lies at 1755 m AMSL.

Genomic DNA was extracted from the blood samples using the salting-out technique. Whole-genome sequencing was then performed on the Illumina NovaSeq 6000 platform in China. The FastQC program was used to assess the quality of all sequencing data. Reads were aligned to the sheep reference genome (https://www.ncbi.nlm.nih.gov/assembly/GCF_00274215.1/) using the Burrows-Wheeler Aligner (BWA-MEM, version 0.7.10) [5]. SAMtools was used to create the SAM (.sam) and BAM (.bam) files and to read, sort, and index them [6]. To limit the probability of false-positive variant calls, potential PCR duplicates were removed with the Picard toolkit (http://broadinstitute.github.io/picard). To improve alignment accuracy, base quality score recalibration (BQSR) and local realignment around indels were performed using tools from the Genome Analysis Toolkit (GATK) [7]. Final variants (single nucleotide polymorphisms, SNPs) were called and filtered with GATK. We analyzed the genomic data of indigenous Iranian sheep using the fixation index (FST) and nucleotide diversity (θπ) statistics to detect candidate genes associated with heat adaptation and immune response, and to compare the genetic structure of indigenous and non-indigenous sheep populations. Our findings may help to clarify the molecular mechanisms of adaptation to hot and dry climates in small ruminants [8]. The whole-genome sequencing data described in this paper have been deposited in the NCBI Sequence Read Archive (SRA) under the accession number PRJNA904537 (https://identifiers.org/ncbi/bioproject:PRJNA904537). For further information and data links, see Table 1 and the references [9,10,11,12,13,14,15,16,17,18,19].
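As an illustration of the two statistics named above, the following minimal Python sketch computes per-site and windowed nucleotide diversity and a between-population fixation index from alternate-allele counts at biallelic SNPs. It is not the analysis pipeline used for this data set: the source does not state which FST estimator or window size was applied, so Hudson's estimator (ratio of averages) and the function names `site_pi`, `window_pi`, and `hudson_fst` are assumptions chosen for clarity, and the allele counts in the usage lines are invented.

```python
from typing import Iterable, Tuple


def site_pi(alt_count: int, n_chrom: int) -> float:
    """Nucleotide diversity at one biallelic SNP.

    alt_count: copies of the alternate allele observed
    n_chrom:   sampled chromosomes (2 x number of diploid animals)
    """
    if n_chrom < 2:
        raise ValueError("need at least two sampled chromosomes")
    ref_count = n_chrom - alt_count
    # Fraction of chromosome pairs that differ at this site.
    return 2.0 * alt_count * ref_count / (n_chrom * (n_chrom - 1))


def window_pi(alt_counts: Iterable[int], n_chrom: int, window_bp: int) -> float:
    """Average diversity per base pair over a window (invariant sites add 0)."""
    return sum(site_pi(k, n_chrom) for k in alt_counts) / window_bp


def hudson_fst(sites: Iterable[Tuple[int, int, int, int]]) -> float:
    """Hudson's FST as a ratio of averages (an assumed estimator).

    Each site is (alt_count_pop1, n_chrom_pop1, alt_count_pop2, n_chrom_pop2).
    """
    num = 0.0
    den = 0.0
    for a1, n1, a2, n2 in sites:
        p1, p2 = a1 / n1, a2 / n2
        # Squared frequency difference, corrected for finite sample size.
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        # Probability that one allele drawn from each population differs.
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else float("nan")


if __name__ == "__main__":
    # Hypothetical counts: 4 diploid animals per breed -> 8 chromosomes each.
    breed_a = [4, 1, 0]   # made-up alternate-allele counts at three SNPs
    breed_b = [2, 7, 0]
    print(window_pi(breed_a, n_chrom=8, window_bp=1000))
    print(hudson_fst([(a, 8, b, 8) for a, b in zip(breed_a, breed_b)]))
```

With only two to four animals per breed, estimates of this kind are noisy, which is one reason such statistics are usually averaged over genomic windows rather than read per site.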

Table 1 Summary of the whole-genome sequencing data of ten Iranian sheep

Whole-genome sequencing data were uploaded to the NCBI SRA database under the accession number PRJNA904537. For the three breeds, Karakul (4 individuals), Zel (4 individuals), and Kermani (2 individuals), a total of ten whole-genome sequencing files were generated. This table provides the link to the BioProject as well as links for each individual sheep.

Limitations

The lack of a reference genome from Iranian sheep for alignment is one of the limitations of our study and of similar studies. These data were generated from short reads using the Illumina approach. Emerging long-read sequencing (LRS) technologies could improve sequencing quality and increase the accuracy of genome evaluation studies.