Large-scale comparisons of whole genome sequences of Neanderthals and Denisovans with those of present-day humans have revealed that admixture occurred between archaic and non-African modern humans (Green et al. 2010; Reich et al. 2010; Meyer et al. 2012; Prüfer et al. 2014). The identification of the characteristic signature of adaptive introgression in the genome sequences of modern humans has been of particular interest. A study on the human leukocyte antigen (HLA, called major histocompatibility complex (MHC) in vertebrates) class I region of two archaic human genomes proposed that modern humans acquired the HLA-B*73 allele via admixture with archaic humans (Abi-Rached et al. 2011). Abi-Rached et al. (2011) assumed that the B*73 allelic lineage was translocated to the extant HLA-B locus (MHC-BI) from the distinct fictive HLA-B locus (MHC-BII) by interlocus recombination before the split of Homininae at the latest. They concluded that the ancestral population of present-day humans lost the HLA-B*73 allelic lineage after the divergence of modern humans and Denisovans, and that this allele was subsequently re-introduced into modern humans through admixture with archaic humans. The introgression hypothesis for the HLA class I gene seems to provide a plausible explanation of why the HLA-B*73 found in present-day humans is structurally divergent from other HLA-B alleles. However, the introgression hypothesis does not appear to be well grounded. First, in contrast to other studies that showed signatures of adaptive introgression from Neanderthals or Denisovans (Racimo et al. 2015), the actual introgressed segment or haplotype (i.e., HLA-B*73) was not detected in the genome of archaic humans in the study conducted by Abi-Rached et al. (2011). Second, HLA-B*73:01, a putative allele derived from the archaic HLA-B*73, is observed in Africans as well as non-Africans. Based on a computer simulation, Abi-Rached et al. 
(2011) mentioned that the presence of HLA-B*73:01 in Africa was due to back migrations. However, this explanation negates an implicit assumption that introgressed haplotypes from Denisovans or Neanderthals are not shared with Africans. Third, to our knowledge, HLA-B*73 has not been detected in Melanesians, who have derived 4∼6 % of their genome from Denisovans (Reich et al. 2010). Fourth, although the 5′ and 3′ flanking regions of HLA-B*73:01 show similarity with other HLA-B alleles, this similarity does not provide any information on the occurrence of introgression, simply because recombination can occur regardless of introgression. Lastly, in the study conducted by Abi-Rached et al. (2011), a computer simulation assuming neutrality at the HLA-B locus was used to provide support for the introgression hypothesis; however, it is well known that the classical class I HLA loci have been subjected to strong balancing selection (Satta et al. 1994). A neutral model cannot reproduce the coalescent process of HLA alleles that have been affected by strong balancing selection (Takahata 1990). Therefore, the origin of archaic-like HLA alleles and haplotypes in modern humans remains a matter of debate. In the present study, we discuss the possibility that the HLA-B*73 lineage has actually been maintained over the course of the evolution of modern humans.

To investigate the molecular evolution of human HLA-B*73, we examined phylogenetic relationships among Homininae MHC-B genomic sequences (ca. 8 kb). For phylogenetic analysis, nucleotide sequences at the MHC-B locus in human and three non-human primates, namely common chimpanzee (Pan troglodytes, Patr-B), pygmy chimpanzee (Pan paniscus, Papa-B), and western gorilla (Gorilla gorilla, Gogo-B), were obtained from the NCBI database (http://www.ncbi.nlm.nih.gov/) after removing relatively short nucleotide sequences. Of the three HLA-B*73 alleles deposited in the database, the HLA-B*73:02 (GenBank accession number AY040668) and HLA-B*73var (GenBank accession number HM347714) alleles were removed from the analyses because of insufficient information on their frequency and relatively long undetermined sequences. Nucleotide sequences were aligned by using the MUSCLE algorithm in MEGA v.5.2 (Tamura et al. 2011) with the complete-deletion option (insertions and deletions were not considered). Then, a maximum likelihood (ML) tree was generated based on the Kimura two-parameter model (Kimura 1980) with discrete gamma distribution among sites using nucleotide substitutions among 23 Homininae MHC-B genomic sequences composed of 8 HLA-B, 9 Patr-B, 1 Papa-B, and 5 Gogo-B (Online Resource 1). The constructed phylogenetic tree showed that the HLA-B*73:01 allele formed a clade with Patr-B*17:01, distinct from the other HLA-B and Patr-B alleles, and at least four MHC-B alleles (HLA-B*73, Patr-B*17, Gogo-B*06, and Gogo-B*07) appeared to be members of a distinct MHC-B allelic lineage in Homininae. It is important to note that Patr-B*17 has at least three subordinate alleles, Patr-B*17:01, Patr-B*17:02, and Patr-B*17:03. Similar to the designation given by Abi-Rached et al. (2011), we refer to the HLA-B*73 and Patr-B*17 alleles as MHC-BII and refer to the other MHC-B alleles (with the exception of two Gogo-B alleles mentioned earlier, Gogo-B*06 and Gogo-B*07) as MHC-BI. 
The Gogo-B*06 and Gogo-B*07 alleles were more distant from MHC-BI alleles than MHC-BII alleles (Online Resource 1). The MHC-BII alleles shared BII-specific mutations that were not observed in MHC-BI allelic lineages (nucleotides shown in red in Online Resource 2), and the pairwise mean genetic distance (measured by Kimura two-parameter model with discrete gamma distribution among sites) between MHC-BI and BII alleles in the chimpanzee (0.04 ± 0.001) was significantly larger than that between the same alleles in the human (0.03 ± 0.001) (P value < 0.05 by Welch’s t test), supporting the presence of both the BI and BII allelic lineages in chimpanzees. The presence of both BI and BII allelic lineages in chimpanzees, demonstrated by the phylogenetic analysis and the pairwise mean genetic distance in the present study, indicates that the exceptionally divergent HLA-B*73 cannot be explained only through admixture with archaic humans. Therefore, it is reasonable to expect that the B*73 allelic lineage has also been maintained in modern humans. A fundamental issue associated with the introgression hypothesis suggested by Abi-Rached et al. (2011) is that it is not clear why the HLA-B*73 allele in the Denisovan genome was not homogenized to MHC-BI sequences by inter-allelic recombination during archaic human evolution. Although the origin of the MHC-BII allelic lineage remains unclear, it is likely that recombination between the MHC-BII and MHC-BI allelic lineages has seldom occurred because of their dissimilar sequences, or that balancing selection has maintained the two distinct allelic lineages.
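
As a rough illustration of the distance measure referred to above, the following Python sketch computes the Kimura two-parameter distance between two aligned sequences, optionally with a gamma correction for rate variation among sites. It is not the MEGA implementation: the function name `k2p_distance`, the `gamma_shape` parameter, and the handling of gaps (sites with a gap or ambiguous base in either sequence are skipped) are assumptions made for illustration, and the gamma shape parameter used in the analysis is not given in the text.

```python
import math

# Transitions are purine<->purine or pyrimidine<->pyrimidine changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2, gamma_shape=None):
    """Kimura two-parameter distance between two aligned sequences.

    gamma_shape=None gives the standard K2P distance; a positive value
    applies the gamma-corrected form with that shape parameter.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (illustrative choice)
        n += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable sites")

    p = transitions / n      # proportion of transitional differences
    q = transversions / n    # proportion of transversional differences
    w1 = 1 - 2 * p - q
    w2 = 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P model")

    if gamma_shape is None:
        return -0.5 * math.log(w1) - 0.25 * math.log(w2)
    a = gamma_shape
    return (a / 2) * (w1 ** (-1 / a) + 0.5 * w2 ** (-1 / a) - 1.5)
```

Averaging such distances over all BI versus BII allele pairs within a species would give a mean of the kind compared above; the comparison of means by Welch's t test is not shown here.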

Next, we discuss whether the HLA-B*73 allele and some HLA class I alleles/haplotypes in present-day humans have been acquired through interbreeding with archaic humans, on the basis of the current distribution of archaic-like HLA alleles or haplotypes in present-day humans. Information on the frequency distribution of HLA alleles across present-day human populations was collected from The Allele Frequency Net Database (AFND, http://www.allelefrequencies.net/, González-Galarza et al. 2015). Abi-Rached et al. (2011) reported that HLA-B*73 was most frequent in west Asia, which was believed to be a site of admixture with Denisovans, and that this allele was rare or absent in other regions. However, the frequency of the HLA-B*73 allele in west Asia (0.24 %) was lower than that in Europe and south Asia (Europe, 0.72 %; south Asia, 0.69 %; Table 1), although it is difficult to demonstrate that the difference is significant because of the low frequencies. This is inconsistent with the assumption that the highest frequency of HLA-B*73 in west Asia is evidence of introgression of this allele (Abi-Rached et al. 2011). The frequency distribution of the HLA-B*73 allele in Abi-Rached et al. (2011) appears to have been estimated from a dataset derived from bone marrow donors, whereas the frequency distribution in the present study is based on the dataset in the AFND. Although the allele frequencies of HLA-B*73 are not shown in the previous study, in the present study the observed number of individuals with HLA-B*73 is at least 3414 in Europe, the region in which the B*73 frequency is the highest, and at least 134 in west Asia (Table 1). These numbers are larger than those in the previous study (2677 in Europe and 128 in west Asia). Therefore, it remains open to question whether the frequency of HLA-B*73 is the highest in west Asia.

Table 1 Allele frequencies of putative Denisovan-derived HLA-B*73 allele in The Allele Frequency Net Database

Incidentally, Abi-Rached et al. (2011) suggested that the existence of B*73 in Africa may have resulted from the back migration from west Asia to Africa; however, it seems unlikely that such a low-frequency allele could have spread rapidly worldwide after the introgression. In addition, as far as we searched, the HLA-B*73 allele has not been observed in Melanesians, even though the Melanesian genome contains the highest proportion of Denisovan ancestry in present-day human populations (Reich et al. 2010; Qin and Stoneking 2015). Therefore, the global distribution of HLA-B*73 does not appear to support the introgression hypothesis.

With respect to Neanderthals, all six Neanderthal HLA class I alleles (the Vindija Neanderthals were heterozygotes at the HLA-A, HLA-B, and HLA-C) have been reported to be identical to their corresponding alleles in present-day humans (Abi-Rached et al. 2011). According to the AFND, some archaic-like alleles are observed in Africans (e.g., 6.1 % for HLA-B*51:01 in North Africa). Even if the admixture introduced archaic HLA alleles into modern humans, it is unlikely that all alleles of Neanderthals spread rapidly to modern humans. Although Temme et al. (2014) reported that a part of the amino acid residues of DPB1*04:01 was derived from Neanderthal introgression, Ding et al. (2014) argued against the possibility because DPB1*04:01 was frequently observed even in sub-Saharan Africans. Similar to DPB1*04:01, it is difficult to state with certainty that HLA-B*73 introgressed from Denisovans into early modern humans.

HLA allelic lineages have been maintained by balancing selection for lengthy periods in evolutionary time (Takahata 1990). To examine the genetic divergence among HLA alleles, nucleotide sequence data at three HLA class I loci (HLA-A, HLA-B, and HLA-C) and three class II loci (HLA-DRB1, HLA-DQB1, and HLA-DPB1) were retrieved from the NCBI database. The number of synonymous substitutions per synonymous site (dS) for all the pairs formed by HLA coding sequences (190 HLA-A, 308 HLA-B, 218 HLA-C, 64 HLA-DRB1, 11 HLA-DQB1, and 17 HLA-DPB1) was estimated by using the modified Nei-Gojobori model (Zhang et al. 1998) with Jukes–Cantor correction (Jukes and Cantor 1969) (R = 1.14 for class I and R = 1.04 for class II), in the same manner as Yasukochi and Satta (2013), in MEGA (Tamura et al. 2011). We found that the maximum dS value (dSmax) of HLA-B alleles with HLA-B*73:01 (0.09) was similar to that of HLA-DRB1 alleles (0.09) (Table 2). Previous studies showed the existence of two distinct allelic groups (groups A and B) at a single HLA-DRB1 locus and the highly reduced genetic variation of the group A allelic lineage in the common chimpanzee (Yasukochi and Satta 2014a; 2014b). These observations allow us to infer that the divergent HLA-B*73 can be maintained as a different HLA-B allelic group at a single HLA-B locus. In addition, the estimated selection coefficient (s) of the HLA-B gene is the highest among the six classical HLA genes (Satta et al. 1994; Yasukochi and Satta 2013; dos Santos et al. 2015; Buhler et al. 2016). The larger the value of the selection coefficient, the longer the persistence of HLA alleles (Takahata 1990). As mentioned earlier, chimpanzees also retained both MHC-BI and MHC-BII allelic lineages; therefore, it is plausible that all MHC-BII allelic lineages in Hominini have persisted for a very long evolutionary time through balancing selection.
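
The final step of the estimate described above, the Jukes–Cantor correction and the search for the most divergent allele pair, can be sketched in Python as follows. This is a simplified illustration only: it assumes that the proportion of synonymous differences for each allele pair has already been obtained (the modified Nei-Gojobori counting of synonymous sites with the transition/transversion ratio R is not implemented here), and the function names and the `p_lookup` structure are hypothetical.

```python
import math
from itertools import combinations


def jukes_cantor(p):
    """Jukes-Cantor corrected distance from a proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1 - 4 * p / 3)


def max_pairwise_distance(names, p_lookup):
    """Return (distance, (allele_a, allele_b)) for the most divergent pair.

    p_lookup maps frozenset({allele_a, allele_b}) to the proportion of
    synonymous differences between the two alleles (assumed precomputed).
    """
    best = None
    for a, b in combinations(names, 2):
        d = jukes_cantor(p_lookup[frozenset((a, b))])
        if best is None or d > best[0]:
            best = (d, (a, b))
    return best
```

Because the correction is monotonic in p, the pair with the largest uncorrected proportion is also the pair with the largest corrected distance; the correction only changes the reported value.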

Table 2 The dS values of pairs formed by alleles in six HLA loci

Abi-Rached et al. (2011) stated that the strong LD between HLA-B*73 and HLA-C*15:05 in populations worldwide was a signature of introgression. However, it is not surprising that a low-frequency allele such as HLA-B*73 is in strong LD with some allele. It should be noted that strong LD is observed not only between the HLA-B*73 and HLA-C*15:05 alleles. For example, in the NCBI dbMHC database (http://www.ncbi.nlm.nih.gov/projects/gv/mhc, Meyer et al. 2007), we found that HLA-B*82, a rare allelic lineage in present-day humans, also exhibited strong LD with HLA-C*03:02 in Africans (Online Resource 3). Further, HLA-B*82 was also observed outside Africa (Online Resource 4). It is striking that the putative haplotype HLA-B*35:02-HLA-C*04:01 shows a pattern quite similar to that of HLA-B*73-C*15:05 (Online Resource 5): the HLA class I genotype data of individuals with HLA-B*35:02 suggest that the LD of HLA-B*35:02 with HLA-C*04:01 is globally strong, but the LD in west Asia (South-West Asia in Online Resource 5) is somewhat weaker than that outside west Asia. In addition, C*15:05 is in LD with a variety of HLA-B alleles in many populations (AFND, http://www.allelefrequencies.net/, González-Galarza et al. 2015), implying that C*15:05 has been maintained in humans for a long time. Therefore, the strong LD between HLA-B*73 and HLA-C*15:05 does not appear to provide conclusive evidence for the recent admixture with archaic humans.
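
The point that a rare allele readily appears to be in strong LD can be illustrated with the standard two-locus LD statistics. The Python sketch below computes D, D′, and r² from haplotype and allele frequencies; the function name `ld_stats` and the frequencies in the usage note are hypothetical values chosen for illustration, not estimates from AFND or dbMHC.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for alleles A and B at two loci.

    p_ab: frequency of the A-B haplotype
    p_a, p_b: frequencies of allele A and allele B
    """
    for p in (p_a, p_b):
        if not 0 < p < 1:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")

    d = p_ab - p_a * p_b
    # D' normalises D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2
```

For example, if a rare allele at frequency 0.01 is always found together with an allele at frequency 0.05 at the other locus, `ld_stats(0.01, 0.01, 0.05)` gives D′ equal to 1 (up to rounding) even though r² is only about 0.19. Complete D′ for a rare allele therefore carries little information on its own about how the haplotype arose.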

Abi-Rached et al. (2011) also reported that haplotypes consisting of all possible combinations of Denisovan HLA-A and HLA-C alleles (i.e., possible HLA-A-C haplotypes that are combinations of HLA-A*11 and HLA-C*12:02 or C*15 alleles) are frequently distributed in Asia and Oceania and that modern humans acquired these haplotypes via admixture with Denisovans in the relatively recent past. However, according to the dbMHC database, these three alleles form haplotypes with a variety of HLA-A or HLA-C and HLA-B alleles, even in Africa (e.g., HLA-C*12:02 forms haplotypes with at least 30 HLA-A and 60 HLA-B alleles). In addition, four HLA alleles (HLA-B*07, B*51, C*07:02, and C*16:02), possible members of Neanderthal-like haplotypes (Abi-Rached et al. 2011), showed a trend similar to that of the Denisovan-like haplotypes (e.g., HLA-C*07:02 forms haplotypes with at least 60 HLA-A and 130 HLA-B alleles). These results suggest that the putative archaic alleles already existed in modern humans prior to the archaic admixture. To confirm this, we further examined whether the HLA-A*11-HLA-C*12:02 and HLA-A*11-HLA-C*15 haplotypes were transmitted from Denisovans, by using a large integrated variant dataset from the 1000 Genomes Project (http://www.1000genomes.org/, The 1000 Genomes Project Consortium 2010). The dataset was obtained via the 1000 Genomes Browser (http://browser.1000genomes.org/). 
The genotype data at 20,937 biallelic sites throughout an approximately 1.4 Mb genomic region from the HLA-A to the HLA-B gene (via HLA-C) were used for the analysis after excluding genotype data at monoallelic and multiallelic sites among 86 individuals [28 unrelated Japanese from Tokyo, 30 unrelated Han Chinese from Beijing, 11 unrelated Utah residents with Northern and Western European ancestry (CEU), and 17 unrelated Yoruba in Ibadan, Nigeria (YRI), whose DNA samples were used in Phase I to III of the International HapMap Project]. Individuals who were homozygous at one or more of the HLA class I genes were excluded, and only one individual per pedigree was selected in the CEU and YRI. The genotype data of HLA class I loci in the 86 individuals were obtained from the report of de Bakker et al. (2006). If an individual is a heterozygote carrying both a putative Denisovan and a modern HLA-A-C haplotype, heterozygous sites are expected to be observed more frequently in that individual than in individuals with modern haplotypes only. However, there was no distinctive difference in the number of heterozygous sites among all individuals examined (Online Resources 6 and 7). Furthermore, we generated phased haplotypes from the genotype data of 10 individuals described above and then constructed a phylogenetic tree on the basis of 19,334 biallelic SNPs after removing monoallelic/multiallelic SNPs and indels (Online Resource 8). The tree displayed two major clades, clades A and B. In the tree, archaic HLA haplotypes are expected to form a distinct clade from modern HLA haplotypes owing to large genetic divergence under the introgression hypothesis; however, HLA alleles of an individual with the archaic and modern HLA alleles did not always form a monophyletic clade, whereas the alleles of an individual with only modern haplotypes were sometimes assigned into different clades. 
These results indicate that putative archaic HLA haplotypes in modern humans have not actually introgressed via the recent admixture.
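
The per-individual comparison of heterozygous sites described above amounts to a simple count over biallelic genotype calls. A minimal Python sketch is given below, assuming genotypes are encoded as VCF-style strings such as `0/1` or `0|1`; the function name, the input layout, and the sample identifiers in the usage note are hypothetical and do not reproduce the actual processing of the 1000 Genomes data.

```python
def count_heterozygous_sites(genotypes):
    """Count heterozygous calls per individual.

    genotypes: dict mapping an individual ID to a list of genotype strings,
    one per biallelic site, e.g. "0/0", "0/1", "1|0", or "./." for missing.
    """
    counts = {}
    for individual, calls in genotypes.items():
        het = 0
        for call in calls:
            # Treat phased ("|") and unphased ("/") calls alike.
            alleles = call.replace("|", "/").split("/")
            if len(alleles) != 2 or "." in alleles:
                continue  # skip missing or malformed calls
            if alleles[0] != alleles[1]:
                het += 1
        counts[individual] = het
    return counts
```

For instance, `count_heterozygous_sites({"ind1": ["0/1", "1|1", "0|0", "1|0"]})` counts two heterozygous sites for `ind1`. Under the introgression hypothesis, carriers of one putative archaic and one modern haplotype would be expected to show clearly higher counts across the HLA-A to HLA-B region than other individuals.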

In conclusion, there is no clear evidence of introgression of the Denisovan-like HLA haplotypes into modern human genomes. Balancing selection at the HLA-B locus would have maintained the HLA-B*73 allelic lineage in the direct ancestors of modern humans for a long evolutionary time. A functional analysis of protein coded by HLA-B*73 may provide the reason for the long-term persistence of this allele.