The Role of Selection and Migration in the Evolution of (Auto)Immunity Genes

The genetic architecture of multiple sclerosis is complicated. In addition, disease incidence varies between populations and geographical regions. A recent study, based on the analysis of ancient and modern human genomes, offers a convincing explanation for the north–south incidence gradient of multiple sclerosis in Europe. Interestingly, the evidence shows that immunogenetic variants associated with multiple sclerosis underwent positive selection in Asian and European populations. Lifestyle and pathogen infections probably shaped the overall multiple sclerosis risk. These results complement previous studies showing that a high percentage of autoimmunity-associated genetic variants are under selection pressure.

The predisposition architecture of multiple sclerosis is complicated. Genome-wide association studies (GWAS) have identified many genetic variants that predispose to this multifactorial disease (Patsopoulos et al. 2019). The environmental contribution is not clear; however, it is well known that Epstein-Barr virus infection plays an important role in disease predisposition (Bjornevik et al. 2022). Genetic variants conferring strong predisposition to multiple sclerosis are found inside the highly polymorphic HLA genes. Notably, carriers of the HLA-DRB1*15:01 variant have an approximately threefold increased risk of multiple sclerosis (Greer 2015).

Barrie et al. (2024) investigated the ancestry of 410,000 genomes registered in the UK Biobank (Bycroft et al. 2018), using 318 ancient DNA samples as a reference panel. They also included in their analysis 86 new genomes of ancient individuals from Denmark, dated to the Medieval and post-Medieval periods and representing a typical Northern European population, plus 1664 previously published ancient genomes. They clustered the ancient genomes into distinct ancestries: western hunter-gatherers (WHG), eastern hunter-gatherers (EHG), Caucasus hunter-gatherers (CHG), farmers (Anatolian (ANA) + Neolithic), steppe, and African. Additionally, they identified a "typical ancestral background" for each UK Biobank DNA sample. From these data, they found that the frequency of the HLA-DRB1*15:01 variant is highest in modern populations from Finland, Sweden, and Iceland, and that it was high in ancient populations of steppe ancestry.

Barrie et al. (2024) used statistical tests based on the length of haplotype blocks (linkage disequilibrium) to examine the genetic origin of the multiple sclerosis-associated variants found in present-day European genomes. They concluded that present-day multiple sclerosis risk variants originate almost exclusively from steppe pastoralist populations that migrated into the northern regions of Europe about 5000 years ago. This can explain the north-south gradient of multiple sclerosis incidence observed today. The single nucleotide polymorphism (SNP) rs3129934, which is in linkage disequilibrium with the HLA-DRB1*15:01 variant, seems to show the largest change in genetic risk over time (selection coefficient = 0.018).

For the purpose of this commentary article, it was investigated whether the genomic position of rs3129934 falls within any of the candidate human genomic regions under selection listed in the PopHumanScan database (Murga-Moreno et al. 2019). The PopHumanScan database lists candidate human genomic regions under selective pressure, based on data derived from the 1000 Genomes Project (Auton et al. 2015). The 1000 Genomes Project produced physical and analytical genetic maps through the genotyping of millions of SNPs in individuals from 26 different human populations. Interestingly, the genetic variants rs3129934 and HLA-DRB1*15:01 are found inside a genomic region with strong evidence for selection (Fig. 1). It is even more interesting that the Finnish population displays the strongest evidence for selection in this region, according to Fu and Li's D test (Fu and Li 1993).
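To give a sense of scale for the selection coefficient of 0.018 quoted above for rs3129934, the following minimal Python sketch iterates a textbook deterministic recursion for an allele with relative fitness 1 + s. It is only an illustration under stated assumptions: the simple genic selection model, the starting frequency `P0`, and the generation time `GEN_YEARS` are hypothetical choices made here, not values or methods taken from Barrie et al. (2024), and real selection pressure is unlikely to have been constant over time.

```python
def allele_trajectory(p0, s, generations):
    """Deterministic allele-frequency trajectory under simple genic selection.

    The favoured allele has relative fitness 1 + s, the alternative allele 1.
    Returns a list with the frequency at generation 0, 1, ..., generations.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # One generation of selection: weight the allele by its fitness and
        # divide by the mean fitness of the population (1 + s * p).
        p = p * (1 + s) / (1 + s * p)
        freqs.append(p)
    return freqs


S = 0.018        # selection coefficient quoted in the text for rs3129934
P0 = 0.01        # hypothetical starting frequency (assumption)
GEN_YEARS = 28   # hypothetical human generation time in years (assumption)

generations = 5000 // GEN_YEARS  # roughly the 5000 years mentioned in the text
trajectory = allele_trajectory(P0, S, generations)
print(f"after {generations} generations: {trajectory[0]:.3f} -> {trajectory[-1]:.3f}")
```

Under this model the odds p / (1 - p) grow by a factor of (1 + s) each generation, so even a coefficient below 2% can shift a rare variant to an appreciable frequency over a few thousand years.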
There is reliable evidence that pathogen infections have contributed to the evolution of genes related to the immune system (Kerner et al. 2021, 2023). Barrie et al. (2024) hypothesize that genetic variants that predispose to multiple sclerosis had a protective role against zoonoses, protecting farmers over the last 10,000 years. They conclude that the north-south gradient of multiple sclerosis prevalence in Europe resulted from the combination of two events: (a) the steppe pastoralist migration into Northern Europe, and (b) the positive selection of particular HLA variants that were carried by these migrations into the northern regions of Europe. Diseases that exhibit distinctive geographical patterns are not unusual in genetics. Sickle cell disease is the best-known example. The prevalence of sickle cell disease is high in geographical regions with a high frequency of infection by Plasmodium falciparum, which causes malaria (Kariuki and Williams 2020). The reason is that carriers of the sickle cell disease mutation are resistant to this infection. Similar explanations have been given for carriers of alpha-thalassemia, beta-thalassemia and G6PD deficiency (Kariuki and Williams 2020).
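The sickle cell example above is commonly formalized in textbooks as heterozygote advantage. The short derivation below is a sketch under that assumed model, not a result from the cited studies: the symbols s and t and the numerical values are hypothetical and chosen only for illustration.

```latex
% Assumed relative fitnesses (A = normal allele, S = sickle allele):
%   w_{AA} = 1 - s   (no protection against malaria)
%   w_{AS} = 1       (carrier, protected)
%   w_{SS} = 1 - t   (sickle cell disease)
% Equilibrium frequency of the S allele under heterozygote advantage:
\[
  \hat{q}_S = \frac{s}{s + t}
\]
% Illustrative, hypothetical values s = 0.1 and t = 0.8:
\[
  \hat{q}_S = \frac{0.1}{0.1 + 0.8} \approx 0.11
\]
```

In this model the harmful allele persists at a stable frequency wherever malaria imposes a cost on non-carriers (s > 0), and is expected to decline where that cost disappears.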
Another recent publication in Nature Communications (Pankratov et al. 2022) gives more evidence for positive selection pressure on genetic variants that predispose to autoimmune diseases. Pankratov et al. (2022) showed that 28% of the known risk loci in 21 inflammatory diseases show evidence for weak to moderate positive selection. Some of those events seem to be population specific. In addition, they showed that the frequency of a significant percentage (19%) of the risk loci under selection has increased because they are in linkage disequilibrium with the actual selection targets. Functional prediction for the disease risk loci under selection shows a close association with enhanced activation of immune pathways. These pathways, such as T cell activation and positive regulation of cytokine production, are crucial in the response to a variety of pathogen infections.
Positive selection of disease risk variants that influence the immune response is closely related to the phenomenon of antagonistic pleiotropy, first described by George Williams (Williams 1957). George Williams was also one of the first evolutionary biologists to link health with evolution. Today, a whole scientific field known as Evolutionary Medicine is expanding worldwide (Stearns 2020; Voskarides 2020; Perry 2021). Antagonistic pleiotropy proposes that genetic variants that contribute to aging or age-related degenerative diseases are probably under positive selection because they increase fitness at a young age. An extended and detailed discussion of multiple examples of antagonistic pleiotropy in humans can be found in the review article by Byars and Voskarides (2020). Autoimmune diseases can, of course, begin at an early age, but the principle of the dual nature of genetic variants remains the same.
In conclusion, contagious diseases can dramatically change the human gene pool. An extreme example is bubonic plague (known as the Black Death), which killed about 50% of the fourteenth-century European population (Klunk et al. 2022; Hui et al. 2024). We can imagine that the survivors, the other 50%, probably carried some immunity-related protective variants. Even the recent COVID-19 pandemic may have contributed to frequency changes in some immune genetic variants. Large-scale, population-wide genomic studies in the near future will shed more light on these exciting aspects of population genetics.
Funding Open access funding provided by the Cyprus Libraries Consortium (CLC).

Declarations
Conflict of interest The author has no competing interests to declare that are relevant to the content of this article.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Fig. 1 Fu and Li's D values for the 1000 Genomes Project populations (population codes are given below), in a candidate genomic region under selection found in the HLA region of chromosome 6 (see text for details). Values that significantly deviate from zero are informative about distinct demographic and/or selective events (Fu and Li 1993). The Finnish population displays the highest value. The graph was created with the PopHumanScan database (Murga-Moreno et al. 2019). East Asia: CHB: Han Chinese, JPT: Japanese,