Introduction

The term consanguinity literally means “shared blood.” A consanguineous marriage is a marriage between closely related individuals and is associated with an increased risk of autosomal recessive genetic diseases in the offspring [4]. More than 1.2 billion people worldwide are reported to practice consanguineous marriage. Consanguinity is often observed in poorly educated populations [22], and improving education allows greater independence and enables more informed life decisions. Consanguineous marriages have been practiced for many generations in many communities around the world [18]. The most common form of consanguineous marriage is between first cousins. In this scenario, spouses share 1/8th of their genes through common ancestors, and their progeny are on average homozygous for 1/16th of all loci [19]. The prevalence of consanguinity varies from country to country (Fig. 1) and is influenced by multiple factors, including religion, ethnicity, demography, geography (rural or urban areas), education, and economic circumstances [19]. Consanguineous marriages account for between less than 20% and more than 50% of all marriages in Arab countries (Table 1), which span the region from North Africa to the Middle East and western Asia [12].
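The first-cousin figures quoted above (1/8 of genes shared, progeny homozygous at 1/16 of loci) follow from standard path counting in a pedigree. The sketch below is illustrative only: it assumes the common ancestors are themselves not inbred, and the function names and the list-of-path-lengths representation are our own choices, not taken from the cited studies.

```python
from fractions import Fraction


def relationship_coefficient(paths):
    """Sum (1/2)**n over every path linking two relatives through a common ancestor.

    `paths` lists the number of meioses (parent-child steps) in each path.
    Assumption: the common ancestors are not themselves inbred.
    """
    return sum(Fraction(1, 2) ** n for n in paths)


def inbreeding_coefficient(paths):
    """Inbreeding coefficient F of the offspring: half the parents' relationship coefficient."""
    return relationship_coefficient(paths) / 2


# First cousins share two grandparents; each grandparent links them through 4 meioses.
FIRST_COUSINS = [4, 4]
# Second cousins share two great-grandparents; each path has 6 meioses.
SECOND_COUSINS = [6, 6]

for label, paths in [("first cousins", FIRST_COUSINS), ("second cousins", SECOND_COUSINS)]:
    r = relationship_coefficient(paths)
    f = inbreeding_coefficient(paths)
    print(f"{label}: shared genes r = {r}, offspring F = {f}")
```

Exact fractions are used so that the results can be compared directly with the values in the text. More complex pedigrees, such as double first cousins or several generations of consanguinity, would need additional paths.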

Fig. 1 World map showing consanguineous marriages. Consanguineous marriages are defined here as second-degree cousins or closer, and frequency is shown as a percentage (%). Image adapted from https://commons.m.wikimedia.org/wiki/File:Global_prevalence_of_consanguinity.svg and licensed under the Creative Commons Share Alike 3.0 Unported license

Table 1 Consanguinity rates in Arab populations

Consanguinity has, unsurprisingly, been identified as a risk factor for congenital malformations and major developmental medical conditions. These malformations include diverse phenotypes such as polydactyly, spinocerebellar degeneration, neural tube defects, anencephaly, and encephalocele [38]. Here, we discuss the global distribution of consanguinity and its impact on a wide variety of diseases, using acute lymphoblastic leukemia, breast cancer, obesity, and rare genetic diseases as examples to illustrate key messages. We also show how modern genetic sequencing techniques can inform the genetics of consanguinity through the identification of novel disease alleles, hypomorphic alleles, and founder alleles.

Global distribution of consanguinity

Recent data indicate that approximately 10.4% of the world's population is married to a biological relative [12]. In North Africa, West Asia, and South India, marriages to biological relatives are culturally favored and constitute 20–50% of all marriages [47]. In Qatar, the reported rate of consanguinity is approximately 54% [8]. In Saudi Arabia, the rate is between 29.7 and 56% [47, 48]. In Libya, the rate of marriage between relatives was 37.6% in the city of Benghazi [1], and a study in Mauritania estimated a consanguinity rate of 47.2% [21]. In Pakistan, consanguineous marriages are reported to account for over 60% of marriages [43].

By comparison, the rate of marriage between relatives is low in European countries, South America, and Australia [25]. In North America and Australia, it is approximately 1%; in Europe, it is approximately 1.5% but depends on local geography and social conditions [27].

Consanguinity and rates of childhood malformations

The rates of early childhood malformations have been correlated with rates of consanguinity [19]. In addition, consanguineous marriage has been shown to be associated with higher rates of reproductive loss, abortion, and neonatal or postnatal death [38]. However, in consanguineous populations overall there may be selection against severe recessive diseases. Many recessive genetic diseases are not compatible with life and reproduction, which leads to counter-selection of the underlying pathogenic variants in populations with long-standing practices of consanguinity.

Consanguinity and incidence of cancer

Many studies from different research groups have indicated that rates of consanguinity have little or no effect on the incidence of cancers. Bener et al. showed that although the rate of consanguinity in Qatar was high, it had no effect on the overall incidence of cancers [10]. However, at a tissue-specific level, an increased risk of leukemia, lymphoma, colorectal cancer, and prostate cancer was shown, while a reduction in breast, skin, thyroid, and female genital cancers was noted [10].

Breast cancer is the most common type of cancer in adult females. The majority of cases are sporadic, but 5–10% are reported to be inherited, with pathogenic variants in BRCA1 and BRCA2 accounting for the majority of these cases. In family genetic studies in Morocco using next-generation sequencing (NGS) technologies, four heterozygous pathogenic variants were found: BRCA1 c.212insA and c.3453delT and BRCA2 c.1310_1313delAAGA and c.723insG [23]. The BRCA1 c.3453delT allele was novel and is likely to be a local founder allele, which prompts better characterization of population-specific alleles. In studies from Arabian countries, consanguinity has been shown to be protective against breast cancer [9, 15, 28]. This may be in part because deleterious BRCA1/BRCA2 variants are lethal in the homozygous state and are therefore selected against in the population. There may also be an increased carrier rate of protective alleles, which may have a greater effect when homozygous [11].

In contrast, much rarer cancer-associated pathogenic variants may be revealed in consanguineous populations. Ripperger et al. reported a case of constitutional mismatch repair deficiency caused by a novel MSH6 pathogenic variant, leading to a T-cell lymphoma and colonic adenocarcinoma [37]. Constitutional mismatch repair deficiency syndrome (CMMRD) is an example of a rare recessive inherited cancer syndrome with a broad tumor spectrum including hematological malignancies, brain tumors, and colon cancer in childhood and adolescence. Baris et al. found a homozygous MSH6 pathogenic variant (c.3603_3606delAGTG) in a consanguineous Bedouin family [6].

Consanguinity and obesity

Obesity is known to be a risk factor for many different diseases including cardiovascular disease, insulin resistance, and type 2 diabetes mellitus. Polymorphisms in the ACE gene have been implicated in different metabolic disorders, including obesity. A recent study investigated genetic associations in the offspring of first cousins and found an association of the ACE II polymorphism with obesity in the Saudi population [4]. Alharbi et al., while screening for obesity in the children of consanguineous parents, also noted that adolescents and adults were three times more likely to develop obesity [3]. The exact molecular mechanisms have not been explored, but metabolic pathways that regulate obesity are influenced by genetic background [45] as well as environmental factors [42]. In outbred populations, only 2–5% of obesity is secondary to monogenic disorders. Interestingly, in a Pakistani inbred population, pathogenic variants in the monogenic obesity genes LEP, LEPR, and MC4R explained 30% of severe childhood obesity. The genetics of common obesity is more complex, but studies have shown that genes associated with monogenic causes of obesity (LEPR, POMC, MC4R, BDNF, SH2B1, and PCSK1) [29, 31, 41, 49] are enriched for more common alleles in obese patients from the general population. Variants in the same genes may therefore have different penetrance, and consanguineous populations may be enriched for both rare and common genetic variants that contribute to an overall increase in obesity [40].

Consanguinity and rare genetic diseases

Rare diseases are by definition those that affect a small proportion of the population. The European Union considers a disease rare if its prevalence is <0.05%, while in the USA a disorder affecting fewer than 200,000 people is considered rare (roughly 0.086%). Rare autosomal recessive disorders are known to be more frequent in the offspring of consanguineous parents [33]. Cockayne syndrome (CS) is a rare autosomal recessive genetic disease caused by pathogenic variants in ERCC6 or ERCC8. CS is characterized by psychomotor retardation, cerebral atrophy, microcephaly, mental retardation, sensorineural hearing loss, premature aging, kyphosis, ankyloses, and optic atrophy [51]. In a Tunisian patient from a consanguineous family, a novel homozygous variant in ERCC6 (c.3156dup; p.Arg1053Thr*8) was identified [51]. Similarly, in a consanguineous family from Jordan with a severe CS phenotype, a novel frameshift ERCC6 variant (c.2911_2915del5Ins9; p.Lys971Tyrfs*14) was found. Such diagnoses of rare diseases with novel pathogenic homozygous alleles are frequent in inbred populations. Studying rare diseases in families offers numerous other opportunities for genetic discoveries. Modern NGS, including targeted panel sequencing, whole exome sequencing, and whole genome sequencing, offers huge potential for molecular genetic diagnostics in these families [44]. Notably, homozygous alleles predicted to be benign, such as synonymous changes within coding regions, can be shown to be pathogenic if the variant is rare, segregates with the disease phenotype, and is investigated at the transcript level. An interesting example of such a finding is the identification of a synonymous NPHP3 allele in a consanguineous Omani family with a ciliopathy syndrome phenotype (hepatorenal fibrocystic kidney and liver disease) [32]. The NPHP3 variant (c.2805C>T; p.Gly935Gly) was initially filtered out as it was predicted to be non-pathogenic. However, the allele was exceedingly rare, lay within a large region of homozygosity by descent, and was predicted to be pathogenic by in silico splicing tools. The allele segregated with the disease phenotype, and the identical allele was found in 4 other cases with similar phenotypes. Finally, abnormal splicing secondary to this allele was shown using RT-PCR [32]. As NGS moves from exomes to genomes, there will be opportunities to identify and determine the pathogenicity of rare deep intronic alleles that may be driving rare disease phenotypes. An example of this is the identification of a deep intronic allele in PKHD1 (c.8798-459C>A) leading to an antenatal presentation of ARPKD in two fetuses in a consanguineous Chinese family [14]. Consanguineous populations allow these genetic studies to be driven forward.
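The NPHP3 example relied in part on the variant lying within a large region of homozygosity. As a rough illustration of how such regions can be located from genotype data, the sketch below scans one chromosome for uninterrupted homozygous stretches. It is a deliberately simplified assumption-laden example: the input format, the function name, and the thresholds (1 Mb and 50 markers) are hypothetical, and dedicated tools typically also tolerate occasional heterozygous or missing calls, which this sketch does not.

```python
def runs_of_homozygosity(genotypes, min_length_bp=1_000_000, min_markers=50):
    """Return (start, end, n_markers) for each long uninterrupted homozygous stretch.

    `genotypes` is a list of (position, allele_a, allele_b) tuples for one
    chromosome, sorted by position. Thresholds are illustrative defaults only.
    """
    runs = []
    start = end = None
    count = 0

    def close_run():
        # Keep the current stretch only if it is long enough and dense enough.
        if start is not None and end - start >= min_length_bp and count >= min_markers:
            runs.append((start, end, count))

    for position, allele_a, allele_b in genotypes:
        if allele_a == allele_b:
            if start is None:
                start, count = position, 0
            end = position
            count += 1
        else:
            # A heterozygous call ends the current stretch.
            close_run()
            start = None
    close_run()
    return runs


# Toy data (invented): one marker every 10 kb, heterozygous at the first 20
# markers and at marker 171, homozygous elsewhere.
toy = []
for i in range(200):
    heterozygous = i < 20 or i == 171
    toy.append((i * 10_000, "A", "G" if heterozygous else "A"))

print(runs_of_homozygosity(toy))
```

In the toy data, only the stretch between markers 20 and 170 passes both thresholds; the short homozygous stretch after marker 171 is discarded because it spans less than 1 Mb.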

Founder alleles may also be identified by studying specific genetic disorders within specific inbred populations. An example is the identification of a homozygous GBA missense variant (c.1246G>A; p.Gly377Ser) in patients with Gaucher’s disease from Northeastern Brazil. The original population from Portugal was of Sephardic Jewish extraction and settled in this location in the 1700s. The combination of this founder allele and high rates of consanguinity contributed to a high prevalence of Gaucher’s disease in this population [13]. Similar examples are seen in other inbred populations, including the identification of a founder allele in AGL leading to glycogen storage disease type IIIa in Inuit populations [39].

Awareness of rare disease alleles within specific populations is now growing, and premarital genetic screening for such alleles is becoming more frequent. Ashkenazi Jews have an exceptionally high carrier frequency for a range of genetic disorders named “Jewish Genetic Disorders,” which include the lysosomal storage disorder Tay-Sachs disease [5]. Carrier screening for these disorders is now recommended and allows informed decisions about marriage and reproduction to be taken. A recent study performed carrier screening of forty disease-causing variants in individuals of Syrian and Iranian Jewish ancestry and compared the results with Ashkenazi Jewish carrier frequency rates [52]. Over 8% of the study population were carriers of at least one pathogenic variant, which supports the importance of premarital genetic screening in order to reduce the incidence of autosomal recessive disease. Such screening programs need to be adopted by other at-risk populations. In a Saudi Arabian population, the carrier frequency of variants in 35 genes associated with the most prevalent disorders was recently assessed [2]. As an example, an allele in MPL (c.317C>T; p.Pro106Leu), which causes thrombocytopenia, was seen in 2.46% of the population compared to 0.01% in the gnomAD database. There are clearly economic, ethical, and social implications for such screening programs, and compliance with genetic screening programs can be low [46].
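To illustrate why carrier frequency and consanguinity together matter for autosomal recessive disease, the following sketch applies the standard population-genetics expression for the probability of a homozygous child, q² + Fq(1 − q), where q is the allele frequency and F the inbreeding coefficient. It rests on several assumptions that are ours rather than the cited studies': the carrier rate is treated as a Hardy-Weinberg heterozygote frequency, the 2.46% figure is used purely as an example input, and F = 1/16 is taken for the children of first cousins.

```python
import math


def allele_frequency_from_carrier_rate(carrier_rate):
    """Solve 2*q*(1 - q) = carrier_rate for the smaller root q.

    Assumption: carriers are heterozygotes at Hardy-Weinberg proportions.
    """
    return (1 - math.sqrt(1 - 2 * carrier_rate)) / 2


def homozygous_child_probability(q, f=0.0):
    """Probability that a child is homozygous for the allele: q^2 + F*q*(1 - q)."""
    return q * q + f * q * (1 - q)


carrier_rate = 0.0246  # example input: the 2.46% carrier frequency quoted in the text
q = allele_frequency_from_carrier_rate(carrier_rate)

unrelated_parents = homozygous_child_probability(q)
first_cousin_parents = homozygous_child_probability(q, f=1 / 16)

print(f"allele frequency q: {q:.5f}")
print(f"homozygous child, unrelated parents: {unrelated_parents:.6f}")
print(f"homozygous child, first-cousin parents: {first_cousin_parents:.6f}")
print(f"relative increase: {first_cousin_parents / unrelated_parents:.1f}-fold")
```

The relative increase grows as the allele becomes rarer, which is consistent with the observation that consanguinity has its largest effect on very rare recessive disorders. A carrier rate measured in a population that is itself highly consanguineous may deviate from Hardy-Weinberg proportions, so the numbers should be read as an illustration rather than an estimate.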

Within consanguineous families, the occurrence of more than one rare genetic disorder is also more frequently seen [24]. There are numerous reports of more than one rare homozygous disease-causing variant giving a combination of disease phenotypes. For example, a Chinese patient had both Wilson disease and retinitis pigmentosa, and homozygous pathogenic variants were identified in ATP7B and CNGA1, accounting for the two phenotypes respectively [50]. Where concurrent inherited genetic disorders lead to overlapping phenotypes, there can be diagnostic confusion. Perrault syndrome is a disorder characterized by primary ovarian insufficiency in females and sensorineural deafness in males and females. Whole exome sequencing in a consanguineous family with six deaf individuals, in which the proband also had primary ovarian insufficiency, identified a pathogenic variant in CLDN14 that explained the sensorineural deafness phenotype and a homozygous pathogenic variant in SGO2 that explained the concurrent ovarian insufficiency [20]. Both deafness and primary ovarian insufficiency are genetically heterogeneous, and the variants were likely acting independently to produce a blended phenotype suggestive of Perrault syndrome. Caution therefore needs to be taken when researchers widen the phenotypic spectra of monogenic disorders in consanguineous individuals without excluding a second homozygous disease-causing allele contributing to the disease phenotype.

A further pitfall in the investigation of a rare disease in consanguineous families is the transmission of two alleles (in the heterozygous or homozygous state) within the same gene and the same family in such a way that it mimics an autosomal dominant inheritance pattern. A consanguineous family of 13 individuals with variable features of Alport syndrome (including hematuria, proteinuria, and kidney failure) appeared to show male-to-male transmission of disease, suggesting autosomal dominant inheritance. Genetic analysis, however, showed a mixture of homozygous and compound heterozygous alleles producing this pseudo-dominant transmission pattern [30]. This case is a good example of how assuming identity by descent in consanguineous families can be misleading.

NGS approaches in consanguineous families are a useful way of identifying homozygous hypomorphic alleles. In autosomal recessive diseases, these alleles typically give milder phenotypes when homozygous and more severe disease phenotypes when in trans with a heterozygous deleterious allele. Such variants have been reported in TMEM67, leading to more limited liver and kidney phenotypes rather than the embryonic lethal Meckel syndrome [34]. Homozygous hypomorphic alleles can also be identified in autosomal dominant diseases and provide exceptional cases for studying disease pathogenicity, such as the identification of a homozygous UMOD allele in a Pakistani family. These families allow a direct comparison of heterozygous and homozygous allele carriers in order to unravel gene dosage effects [16].

Whole exome and whole genome sequencing to investigate consanguineous families

With the advent of affordable NGS approaches, whole exome and whole genome sequencing are rapidly becoming the first-line investigation for rare diseases and cancers. Individual families or large cohorts of families with shared phenotypes undergo exome or genome sequencing, and the results yield high diagnostic rates, add to the number of known disease-causing alleles, and inform global genetics projects. Some care does need to be taken before assigning pathogenicity to genetic variants in rare diseases. An example of this was the identification of a homozygous variant in CCDC28B as a potential novel genetic cause of Joubert syndrome [36]. However, further analysis of this variant showed that it was not ultra-rare and had been seen in its homozygous state in control samples from a wide range of ethnicities [7]. A similar example was the initial report of a homozygous c.428delG variant in KIAA0586 in patients with Joubert syndrome. However, careful segregation and RNA studies, alongside population frequency data, demonstrated that this allele on its own was not pathogenic [35].

In countries with limited resources, singleton whole exome sequencing has been advocated as a first-tier diagnostic test to limit costs [26]. This approach may be suitable for detecting known alleles in known genes. However, given the examples above, caution is needed when it is used without segregation of alleles for gene discovery in consanguineous families, where homozygous alleles may be numerous and rare but not necessarily pathogenic.
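The checks discussed above (rarity in population data, absence of homozygotes in controls, and segregation with disease within the family) can be combined into a simple candidate filter. The sketch below is a minimal illustration under stated assumptions: the `Variant` record, its field names, the 0.1% frequency cut-off, and the example data are all hypothetical, and passing the filter marks a variant only as a candidate for follow-up such as RNA studies, not as pathogenic.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    label: str                # placeholder identifier (hypothetical)
    population_af: float      # allele frequency in a reference population database
    control_homozygotes: int  # homozygous carriers seen among controls
    genotypes: dict           # family member -> number of alternate alleles (0, 1, or 2)


def segregates_recessively(variant, affected, unaffected):
    """True if every affected member is homozygous and no unaffected member is.

    A missing genotype fails the check, so untested relatives block a call.
    """
    return (all(variant.genotypes.get(m) == 2 for m in affected)
            and all(variant.genotypes.get(m) in (0, 1) for m in unaffected))


def candidate_variants(variants, affected, unaffected, max_af=0.001):
    """Keep rare variants with no control homozygotes that segregate with disease."""
    return [v for v in variants
            if v.population_af <= max_af
            and v.control_homozygotes == 0
            and segregates_recessively(v, affected, unaffected)]


# Invented example family: two affected siblings, both parents, one unaffected sibling.
affected = ["child_1", "child_2"]
unaffected = ["mother", "father", "child_3"]
family = {"child_1": 2, "child_2": 2, "mother": 1, "father": 1, "child_3": 1}

variants = [
    Variant("variant_A", 0.00001, 0, family),                    # rare and segregates
    Variant("variant_B", 0.02, 14, family),                      # too common, control homozygotes
    Variant("variant_C", 0.00002, 0, {**family, "child_3": 2}),  # unaffected sibling homozygous
    Variant("variant_D", 0.00003, 0, {"child_1": 2}),            # singleton data only
]

for v in candidate_variants(variants, affected, unaffected):
    print(v.label)
```

Treating missing genotypes as a failure is a conservative design choice: it means that a singleton exome alone, as in `variant_D`, never yields a candidate under this filter, which mirrors the caution about gene discovery without segregation data.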

Consanguinity and education

There are numerous reports correlating poor levels of education and consanguinity [17]. Women with low levels of education are more likely to be in a consanguineous marriage. Improving the education of women will allow more informed decisions based on the huge evidence base of the adverse health effects on their children resulting from a consanguineous marriage. However, there is also evidence that deep-rooted social and cultural beliefs and personal preferences outweigh improvements in education [17]. The modern advent of genetic screening and awareness of certain risk alleles within specific inbred populations allows opportunities for positive health care interventions such as premarital screening to reduce risks of inherited diseases.

Conclusions

The effects of consanguinity on health and disease are being increasingly recognized. We have used examples from cancer, obesity, and rare inherited diseases to show how the effects of consanguinity need to be carefully considered. NGS approaches in consanguineous families with both common and rare diseases have allowed many new gene-disease discoveries. Such technologies make it much more feasible to investigate consanguineous families for inherited diseases and predisposition to other disorders such as cancers. It is important to increase knowledge and public awareness regarding the risks of consanguinity, and worldwide education programs may help with this. Patients, families, and their physicians should actively engage in research on the relationship between consanguinity and disease through a multidisciplinary approach.