Research focused on ascertaining the effect of genetics on exercise traits has been progressively growing in the last few years, providing information about associations between some genetic polymorphisms and endurance- and power-based exercise performance, among other sports disciplines (Ahmetov and Fedotovskaya 2015; Ahmetov et al. 2021). Although the research in this field is extensive, there is still controversy and debate about the magnitude of the potential effect of genetics on sports performance and about how the influence of positive/negative heritable traits can be overcome by training and diet (Pickering et al. 2019a; Sarzynski et al. 2017; Stephan 2012). Some of this debate is based on the limitations and methodological issues of the research that presents the evidence on this topic (Breitbach et al. 2014). In this regard, studies investigating the effect of genetics on endurance and power sports performance may be confounded by the low sample sizes, incorrect categorization of elite athletes, the lack of measurement of valid exercise performance traits, and the obtaining of outcomes based only upon cross-sectional observations of genotype frequencies between athlete and non-athlete populations (Varillas Delgado et al. 2019, 2020; Varillas-Delgado et al. 2021). Additionally, elite athletes have undergone intensive training and field testing/selection prior to being studied and investigations rarely contain samples of athletes in which prior training has been standardized. In fact, in most of the studies that controlled the training stimuli, the duration of the training was short, and it was oriented to the improvement of fitness rather than to the achievement of high-performance aims (Issurin 2008, 2019). 
Although it is acknowledged that the standardization of training is unfeasible when studying elite athletes, particularly in studies with large sample sizes and in individual sports, these limitations of the current evidence restrict the generalizations that can be drawn from studies in the field of genetics and sports performance (Guth and Roth 2013; Bouchard 2012; Montgomery et al. 1998). As a result, despite the large amount of human and economic resources invested, the conclusions of current studies remain approximate and do not conclusively clarify how genetics impacts sports performance.

Interested readers are directed to review articles on the different links that some genetic polymorphisms and profiles have with sports performance (Joyner and Coyle 2008; Pickering et al. 2019a; Ostrander et al. 2009; Lippi et al. 2010; Storz et al. 2015; Tucker et al. 2013), according to phenotypes in endurance (Ahmetov et al. 2016; Ghosh and Mahajan 2016; Zhou 2014; Vellers et al. 2018; Collins et al. 2016) and power sports (Ghosh and Mahajan 2016; Eynon et al. 2013). There are also interesting and recent review articles documenting the advances made in exercise performance, fitness, and genomics (Wang et al. 2013; Pitsiladis et al. 2013; Sarzynski et al. 2016; Loos et al. 2015a).

The growing knowledge about the role of these genes in the improvement of sports performance and the advances in gene therapy make them likely targets for gene doping approaches (López et al. 2020; Baoutina et al. 2007). Several strategies are currently used in gene therapy. Adeno-associated virus-derived vectors, lentiviral vectors and CRISPR/Cas9 gene editing are used as gene therapy approaches in several models, such as hematopoietic stem cell gene therapy (Lamsfus-Calle et al. 2020), cochlear gene therapy (György and Maguire 2018), and gene therapy targeting Schwann cells for demyelinating neuropathies (Sargiannidou et al. 2020). These rapid advances in molecular biology techniques open up many possibilities for the development of methods for gene doping detection (López et al. 2020).

The purpose of this review is to comprehensively appraise the evidence on the impact of genetics on exercise performance and to determine whether it supports the utility of genotyping to detect sports talent, enhance training or prevent exercise-related injuries, with a specific focus on endurance and power exercise.

Sports performance

Genetic approximations in sports performance

It is well accepted that some innate factors play a key role in sports performance and related phenotypes, such as power, strength, aerobic capacity, flexibility, coordination, and temperament (Naureen et al. 2020). Despite a relatively strong influence of heritability on the probability of becoming an elite athlete (> 70%, depending on the sports discipline) (De Moor et al. 2007) or on the values of phenotypes typically associated with performance (Simoneau and Bouchard 1995; Alonso et al. 2014), the search for genetic variants contributing to a greater predisposition to success in certain sports disciplines has been a challenging task (Holdys et al. 2013; Tucker et al. 2013; Maciejewska-Skrendo et al. 2019b). Sports genomics is a new scientific discipline focusing on the organization and functioning of the genome in elite athletes (Ahmetov and Fedotovskaya 2015). The era of sports genomics began in the 2000s with the discovery of the first genetic markers associated with sports performance: angiotensin-converting enzyme (ACE) (Jones et al. 2002), α-actinin 3 (ACTN3) (Yang et al. 2003), adenosine monophosphate deaminase 1 (AMPD1) (Rubio et al. 2005) and peroxisome proliferator-activated receptor gamma co-activator 1-alpha (PGC1α) (Lucia et al. 2005). With the availability of genotyping, sequencing and deoxyribonucleic acid (DNA) microarray analysis, a large number of genetic studies have been published reporting associations of candidate gene variants with elite athlete status (Ahmetov and Fedotovskaya 2015; Varillas Delgado et al. 2019; Pickering and Kiely 2020). To date, the claim that genetic variability contributes to interindividual responses to or during exercise, or to the likelihood of becoming an excellent athlete, is well supported; however, when it comes to determining which polymorphisms contribute to the phenotype of a champion, the evidence is weaker because of the low number of replication studies. 
Additionally, genetic predisposition is not the only factor for athletes to become champions, and it is necessary to consider the multiple environmental and epigenetic variables that shape the final phenotype of an elite athlete (Tanisawa et al. 2020).

Several single-nucleotide polymorphisms (SNPs) have been associated with elite endurance and power performance in the last few years, and the polygenic trait is the current paradigm in elite performance, with each variant making a minor contribution to the athletic phenotype. Polymorphisms such as ACE insertion/deletion (I/D) (rs4646994), ACTN3 c.1729C > T (rs1815739), angiotensinogen (AGT) c.4072T > C (rs699), AMPD1 c.34C > T (rs17602729), homeostatic iron regulator (HFE) c.187C > G (rs1799945) and c.845G > A (rs1800562), interleukin-6 (IL6) c.-174C > G (rs1800795), and endothelial nitric oxide synthase 3 (NOS3) c.-786T > C (rs2070744) and c.894G > T (rs1799983) have been linked to traits of elite endurance athletes. The polymorphisms in the peroxisome proliferator-activated receptor alpha (PPARα) intron 7 G > C (rs4253778), peroxisome proliferator-activated receptor delta (PPARδ) c.294T > C (rs2016520), polypeptide N-acetylgalactosaminyltransferase-like 6 (GALNTL6) last intron C > T (rs558129) and mitochondrial uncoupling protein 2 (UCP2) exon 4 C > T (p.Ala55Val; rs660339) have also been associated with elite performance, but the findings are less consistent (Eynon et al. 2013; Ruiz et al. 2010; Maciejewska-Skrendo et al. 2019b; Ben-Zaken et al. 2019; Díaz Ramírez et al. 2020; Moir et al. 2019). A recent review revealed that, to date, more than 69 genetic markers have been associated with power athlete status (Ahmetov et al. 2021).

A fundamental aspect that may influence the link between sports performance and genetics is the field of nutrigenomics and nutrigenetics, experimental approaches that use genetic testing technologies to examine the role of individual genetic differences in shaping an athlete's response to nutritional interventions (Mathers 2017; Reddy et al. 2018). This is important when assessing the influence of genetics on sports performance because an individual's dietary and supplement strategies can influence his/her sports performance (Guest et al. 2019; Thomas et al. 2016). Personalized nutrition in elite athletes based on genetic profile aims to optimize health, body composition, and sports performance (Saunders et al. 2019). Sport dietitians have been adept at scrutinizing the recommendations in general population dietary guidelines and adapting them to accommodate various sporting populations (Guest et al. 2019; Pickering and Kiely 2018). Genetic differences are known to impact absorption, metabolism, uptake, utilization and excretion of nutrients, and food bioactives may affect several metabolic pathways (Görman et al. 2013). However, to date, investigations that consider the influence of nutritional factors when analyzing the link between genetics and exercise performance are scarce (Guest et al. 2019). In the future, it will be necessary to discover the genetic modifiers of the various dietary factors that impact an athlete's nutritional status, body composition and sports performance in order to optimize these nutritional factors.

Genetics and endurance performance

The ability to perform endurance exercise is associated with aerobic metabolism, or the capacity to use oxygen to produce energy, strongly supported by the enhanced mitochondrial function, gene expression and enzyme activity of elite athletes (Lippi et al. 2010). In a wider perspective, the capacity to perform endurance exercise is influenced by several central factors relating to muscle and cardiovascular function (Al-Khelaifi et al. 2020; Ahmetov et al. 2015). These include the proportion of slow-twitch fibers in skeletal muscle and factors such as maximal cardiac output, which underlies the maximal rate of oxygen consumption (VO2max). Regular endurance exercise training induces major adaptations in skeletal muscle, with favorable metabolic consequences. Examples are the slower utilization of muscle glycogen and blood glucose, a greater reliance on fat oxidation as a fuel during exercise, and less lactate production at a given exercise intensity. Training also induces an enlargement in cardiac dimensions and an increase in blood volume, which favor a greater filling of the ventricles and a consequently larger stroke volume and cardiac output (Hellsten and Nyberg 2015). All these adaptations play an important role in the large increase in the ability to perform prolonged strenuous exercise in response to endurance exercise training (Holloszy and Coyle 1984; Davies and Thompson 1979), but not all of them may be mainly determined by genetics (Varillas-Delgado et al. 2021). Still, it is well accepted that most endurance-related phenotypes are under strong genetic influence, with the length, duration, type, intensity and age of initiation of the training stimulus as additional contributors (Levine 2008). Some 40–50% of the variance in the proportion of slow-twitch fibers in human muscles seems genetically determined (Simoneau and Bouchard 1995), while VO2max and aerobic power also have high heritability (Moir et al. 2019; Miyamoto-Mikami et al. 2018; Zadro et al. 2017). Interestingly, gains in VO2max present a large interindividual variation even in response to standardized exercise training, and it is estimated that ~ 50% of the change in VO2max with aerobic training is associated with innate factors (Sarzynski et al. 2017). However, to date, studies on the influence of genetics on other phenotypes highly associated with endurance performance, such as exercise intensity or VO2 at the blood lactate threshold, are non-existent, even though the utility of these variables for predicting endurance performance has been recognized for several decades (Brooks 1985).

Endurance athlete status remains the most-studied trait, with up to 100 genetic variants associated with a genetic predisposition (Table 1) (Miyamoto-Mikami et al. 2018; Ahmetov et al. 2021), but much research is still needed to pinpoint the genetic variants that enable better endurance performance or better adaptation to endurance training. This will be a difficult task in future years because of the complexity of the systemic and molecular phenomena associated with endurance performance (Lundby et al. 2017).

Table 1 Genetic markers for endurance athlete status (Ahmetov et al. 2021)

Genetics, endurance performance and cardiovascular disease

Several genes have been associated with endurance sports and phenotypes in terms of muscle functioning (ACTN3 and the c.1729C > T polymorphism) or blood pressure regulation (ACE and the I/D polymorphism), with substantially high replication in multiple cohorts. Specifically, several studies have found that the TT genotype in ACTN3 or the DD genotype in ACE is more frequent in elite endurance athletes than in control populations of untrained individuals, or that athletes with these genotypes have higher values of variables associated with endurance exercise performance than their counterparts with CC/CT or II/ID genotypes (Ma et al. 2013; McAuley et al. 2021; Vaughan et al. 2016; Miyamoto-Mikami et al. 2018); however, other studies have not found any effect of the ACTN3 and ACE genotypes on endurance exercise performance (Papadimitriou et al. 2018; Mägi et al. 2016). Additionally, the genetic component of the observed variability of muscle fiber types in humans is ~ 40–50%, indicating that muscle fiber-type composition is determined roughly equally by genotype and environment (Ahmetov et al. 2012). Several polymorphisms of genes involved in the calcineurin-NFAT pathway, mitochondrial biogenesis, glucose and lipid metabolism, cytoskeletal function, hypoxia and angiogenesis, and circulatory homeostasis have been associated with fiber-type composition (Ahmetov et al. 2012; Eynon et al. 2011a; Maciejewska-Skrendo et al. 2019a; Mustafina et al. 2014). Many of these gene variants in muscle have been associated with sports performance and elite athlete status, as well as with metabolic and cardiovascular diseases (Houweling et al. 2018). Genetic variants associated with fiber-type proportions have relevant implications for the understanding of muscle performance (Mustafina et al. 2014; Hughes et al. 2011; Ahmetov et al. 2012).

VO2max is a strong prognostic factor of morbidity and mortality from all causes and, particularly, from cardiovascular disease (CVD) (Lavie et al. 2019). Numerous genetic association studies have been performed for other CVD risk factors, such as poor diet, high blood pressure and cholesterol, stress, smoking and obesity, revealing new therapeutic targets (Montefiori et al. 2018). VO2max has a large genetic component, but association studies are still lacking (Bye et al. 2020). Identification of genes associated with VO2max would lead to a better understanding of the complex relationship between VO2max and CVD in the context of endurance performance (Bouchard et al. 2000; Rico-Sanz et al. 2004; Ahmetov et al. 2015; Al-Khelaifi et al. 2020). However, to date, studies are limited in size and employ prespecified genomic associations, limiting the discovery of new genetic markers (Bye et al. 2020). A large-scale systematic screening of genetic variants associated with directly measured VO2max in the athlete population is still needed (Bray et al. 2009). So far, the lack of studies directly measuring VO2max in high-performance athletes has limited the possibilities of associating genetic variants with this phenotype.

Genetics and interindividual difference in response to aerobic training

The response to an exercise intervention is often described in terms of the group average, with the assumption that the average is a typical response for most groups of athlete populations. However, the phenomenon of “high responders” and “low responders” following a standardized training intervention may provide helpful insights into mechanisms of training adaptation and methods of individualized training prescription (Fyfe et al. 2014). Genetics makes an important contribution to individual variation in training responses, such as change in body mass index (Bae et al. 2007), body composition (Duoqi et al. 2015), improvement in insulin activity for glucose utilization during exercise (Weiss et al. 2005), a favorable response to endurance training for diastolic blood pressure (DBP) and mean arterial pressure (MAP) (Jayewardene et al. 2016; Rankinen et al. 2000), and improvement of VO2max following endurance exercise training (Timmons et al. 2010). The association between genotype and training response has traditionally been supported using heritability estimates; more recently, studies have linked variation in some training responses to specific SNPs (Mann et al. 2014; He et al. 2007, 2010; Rice et al. 2012; Thomaes et al. 2011). Heritability is often expressed through the influence of innate factors on the pre-training phenotype, with some parameters showing a hereditary effect on the pre-training phenotype but not on the subsequent training response (Mann et al. 2014). Individual variation in response to standardized training that cannot be explained by genetic influences may be related to the characteristics of the training program, compliance with the program or differences in lifestyle factors among individuals (Mann et al. 2014; Ahtiainen et al. 2020; Zubair et al. 2019). 
Interestingly, the level of recovery between exercise sessions in endurance athletes has recently been shown to be related to liver-metabolizing genes, such as cytochrome P450 2D6 (CYP2D6) and genes encoding the family of glutathione S-transferases (Varillas Delgado et al. 2019), suggesting that future studies should associate genetic variants with the capacity for recovery through the scavenging of free radicals or the neutralization of pH (Lewis et al. 2015). In this context, a recent study of interest by Varillas Delgado et al. (Varillas Delgado et al. 2019) examined the influence of polymorphisms in the cytochrome P450 gene (i.e., the c.506-1G > A polymorphism in CYP2D6; formerly 1846G > A: rs3892097) and the glutathione S-transferase family (“null polymorphism” in GSTM, GSTP p.Ile105Val polymorphism (rs1695) and “null polymorphism” in GSTT), and found that the frequency of “favorable” variants in these genes differed between endurance athletes and a control population, which was confirmed by different scores in a total genotype score (TGS) model. Such information on these liver-metabolizing genes would be of interest to promote genetic scoring across the various genes involved in recovery metabolism.

Some individuals may develop chronic fatigue and even maladaptation due to an imbalance between overall stress and recovery, contributing to variation in pre–post training responses (Doma et al. 2019; Mann et al. 2014). Training response can be modulated by the timing and composition of dietary intake, and hence nutritional factors could also potentially contribute to this individual variation (Close et al. 2016; van Loon and Tipton 2013). For example, acute ingestion of the phytochemical caffeine is a dietary factor that increases endurance performance with considerable interindividual differences (Guest et al. 2021). Future genetic studies that associate caffeine ergogenicity, genetics and interindividual responses may help to enhance the use of caffeine supplementation to improve sports performance in a more individualized manner (Gutiérrez-Hellín and Varillas-Delgado 2021). Finally, a certain amount of individual variation in responses to training may also be attributed to measurement error, a factor that should be accounted for wherever possible in future studies. Clarifying and quantifying the role of these factors would serve the ultimate goal of improving the prediction of training responses to endurance exercise and providing further insight into the mechanisms of aerobic-based training adaptations (Fig. 1).

Fig. 1 Schematic of genetic differences in power and endurance performance to identify talent for sports based on DNA testing. Knowledge of genetics facilitates gene doping, using the rapid advances in molecular biology techniques for the future development of methods to detect this doping modality. ACE angiotensin-converting enzyme, ACTN3 α-actinin 3, AMPD1 adenosine monophosphate deaminase 1, CK-MM creatine kinase isoenzyme MM, CYP2D6 cytochrome P450 family 2 subfamily D member 6, HFE homeostatic iron regulator, IGF1 insulin-like growth factor 1, IL6 interleukin 6, NOS3 endothelial nitric oxide synthase 3, PPARγ peroxisome proliferator-activated receptor gamma, UCP2 uncoupling protein 2, VEGFA vascular endothelial growth factor A

Genetics and power performance

Genetics strongly influences the ability of skeletal muscle to produce force at a high velocity, crucial for success in power and sprint-based sports (Eynon et al. 2013; Ruiz et al. 2010).

To date, the genetic profile of elite athletes remains poorly characterized, and it is likely that each SNP makes a limited contribution to an elite athletic phenotype (Ahmetov et al. 2016). The genetic profile in elite-level power/sprint performance is different from that of elite endurance performance (Eynon et al. 2011b; Maciejewska-Skrendo et al. 2019b; Buxens et al. 2011). The metabolic and biochemical phenotypes required to perform in endurance events and in power/sprint events are polar opposites (Eynon et al. 2013). Elite endurance performance requires sustained muscular contraction over a long period of time, relying on a high VO2max and other traditional endurance capabilities involving the mitochondrial respiratory chain (Miyamoto-Mikami et al. 2018; Hagberg et al. 2001; Greggio et al. 2017). In contrast, short-distance sprint and power performance require high-speed and forceful muscle contraction, which depends on the anaerobic pathways and uses intramuscular stores of creatine phosphate (CP) and adenosine triphosphate (ATP) as the main substrates for energy production (Spencer and Gastin 2001; Praagh 2007; Sousa et al. 2017).

Recent studies have investigated the link of the myostatin gene (MSTN) with exercise performance. Within the sequence of this gene, the p.K153R polymorphism has been deemed a genetic variant that influences skeletal muscle phenotypes, as the MSTN RR variant is more frequent in top-level endurance runners than in national-level counterparts (Ben-Zaken et al. 2015c). Additionally, the combination of the R allele in the MSTN gene and the T allele in the insulin-like growth factor 1 (IGF-1) c.-1245C > T (rs35767) polymorphism has been examined; the IGF-1 c.-1245C > T variant has been associated with higher circulating IGF-I levels, greater muscle mass and improved endurance performance (Ben-Zaken et al. 2017). Also, the MSTN c.163G > A (rs180565) polymorphism plays an important role in the stability of MSTN propeptide inhibitory activity (Ferrell et al. 1999), showing significantly different allele frequencies in Caucasians and African Americans, which may help explain differences in muscle mass and strength among power athletes of these ethnicities. These allelic variants in MSTN provide markers for examining the association between the myostatin gene and interindividual variation in muscle mass and differences in loss of muscle mass with aging (Ferrell et al. 1999). MSTN polymorphisms are therefore potential candidates for positively affecting the skeletal muscle phenotype after power exercise. However, previous myostatin SNP studies mainly observed chronic effects, such as muscle hypertrophy after exercise in relation to ethnicity (Li et al. 2014; Kostek et al. 2009), and no study has examined acute responses such as muscle damage.

The genetic influence on elite power/sprint performance has received less scientific attention than the genetics of endurance performance. Only a few studies have characterized the associations between genetic variants and elite power/sprint performance (Table 2) (Ruiz et al. 2010; Eynon et al. 2013; Ben-Zaken et al. 2015b, 2019; Ahmetov et al. 2016, 2021; Pickering et al. 2019b). Several studies have reported that muscle power and muscle strength, which are important for power/sprint performance, are under strong genetic influence (Calvo et al. 2002; Thomis et al. 1998a, b; Suchomel et al. 2018).

Table 2 Genetic markers for power athlete status (Ahmetov et al. 2021)

A previous study by Ruiz et al. (Ruiz et al. 2010) using six polymorphisms in candidate genes (ACE, ACTN3, AGT, GDF-8, IL6, and NOS3) and adding the influence of all variants in a TGS model, identified a polygenic profile to distinguish elite power athletes from both endurance athletes and nonathletic individuals, reducing the possibility of finding individuals with a “perfect” TGS, while increasing the predictive accuracy of the model. Moreover, it must be kept in mind that, beyond genetic endowment or complex gene–gene and gene–environment interactions, there are other numerous contributors to the “complex trait” of being an athletic champion, such as motivation, socioeconomic factors, or simply opportunity (Ruiz et al. 2010). Hence, the possession of a high TGS for power performance does not guarantee success in power-based sports.
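
The additive logic of a TGS model can be illustrated with a short sketch. The example below is a minimal illustration, assuming the commonly used scoring in which each genotype receives 0, 1 or 2 points and the sum is rescaled to a 0–100 range. The `GENOTYPE_SCORES` table, the function name `total_genotype_score` and the choice of which genotype counts as favorable are hypothetical placeholders for illustration, not values taken from Ruiz et al. (2010).

```python
from typing import Dict, Mapping

# Hypothetical scoring table (assumption for illustration only):
# 2 = genotype assumed most favorable for power, 1 = heterozygote, 0 = least favorable.
GENOTYPE_SCORES: Dict[str, Dict[str, int]] = {
    "ACE": {"DD": 2, "ID": 1, "II": 0},
    "ACTN3": {"RR": 2, "RX": 1, "XX": 0},
}


def total_genotype_score(
    genotypes: Mapping[str, str],
    scores: Mapping[str, Mapping[str, int]] = GENOTYPE_SCORES,
) -> float:
    """Return an additive genotype score rescaled to the 0-100 range.

    `genotypes` maps a gene name to the individual's genotype, for example
    {"ACE": "DD", "ACTN3": "RX"}. Every polymorphism is weighted equally,
    which is itself a simplifying assumption of this kind of model.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    # Sum the 0/1/2 score of each polymorphism carried by the individual.
    total = sum(scores[gene][genotype] for gene, genotype in genotypes.items())
    # The maximum attainable sum is 2 points per polymorphism.
    return 100.0 * total / (2 * len(genotypes))


if __name__ == "__main__":
    athlete = {"ACE": "ID", "ACTN3": "RR"}
    print(total_genotype_score(athlete))
```

With n polymorphisms scored in this way there are 3^n possible genotype combinations, and only one of them yields a score of 100, which is why a “perfect” profile becomes rarer as more polymorphisms are added to the model.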

The inter-individual difference in the ergogenic response to acute caffeine intake has been linked to variants in the cytochrome P450 1A2 (CYP1A2) gene, specifically to the c.-163C > A (rs762551) polymorphism. Although there is evidence to support that athletes with the AA genotype in this gene obtain greater benefits from acute caffeine intake (Guest et al. 2018), most evidence suggests that AA and C-allele carriers equally benefit from caffeine supplementation in several exercise and sports scenarios (Del Coso et al. 2012; Muñoz et al. 2020; Puente et al. 2018). Likewise, most evidence to date suggests that the ergogenic effect of caffeine on neuromuscular performance may be independent of the CYP1A2 genotype (Barreto et al. 2021).

Genetics and muscle performance

Mammalian skeletal muscle is an extremely heterogeneous tissue, and this heterogeneity is essential for its function. Differences in the mechanical and energetic properties of isolated mammalian slow-twitch and fast-twitch muscle fibers have been well documented (Lippi et al. 2010; Schiaffino and Reggiani 2011). Slow-twitch fibers are slower, less powerful and more economical in terms of the energy cost of force generation than fast-twitch muscle fibers. Furthermore, the peak efficiency of slow-twitch muscle fibers occurs at slower shortening speeds than that of fast-twitch muscle fibers (Bottinelli and Reggiani 2000). This heterogeneity covers all possible aspects of muscle contractile function and serves to optimize contractile responses and the performance of different motor tasks while minimizing fatigue. Endurance capacity has been related to a predominance of slow-twitch fibers (> 50%) (Wilson et al. 2012), whereas fast-twitch fibers are related to power and speed capacity (Fitts et al. 1991). Power and sprint athletes have a high proportion of fast-twitch muscle fibers with low oxidative capacity compared to endurance athletes (Bergh et al. 1978).

Several polymorphisms in target genes codifying muscle proteins have been linked to muscle performance. The creatine kinase isoenzyme MM (CK-MM) gene encodes the cytosolic muscle isoform of creatine kinase, which is responsible for the rapid regeneration of ATP during intensive muscle contraction and plays a vital role in the energy homeostasis of muscle cells (Echegaray and Rivera 2001). Among the numerous polymorphisms located in the CK-MM gene (Chen et al. 2017), the c.*800A > G polymorphism (rs8111989), located in the 3′ untranslated region of the gene, has been previously associated with elite athlete status (Eider et al. 2015; Lucía et al. 2005; Rivera et al. 1997b; Chen et al. 2017). Overall, the AA genotype and the A allele are more frequent in elite athletes (Grealy et al. 2015; Zhou et al. 2006), while the A allele can even affect the increase in aerobic power induced by endurance training (Rivera et al. 1997a). Further investigation is warranted, as creatine kinase is a central controller of cellular energy homeostasis. In this case, the study of the influence of variants in the CK-MM gene on exercise performance should be extended by investigating other forms of creatine kinase, such as the mitochondrial isoenzyme of creatine kinase (MtCK) (Schlattner et al. 2006). Specifically, MtCK interferes with the regulation of oxidative phosphorylation, as well as with the mitochondrial permeability transition (Calbet et al. 2020), and polymorphisms in the gene codifying this protein may produce different responses to situations of energy stress (e.g., endurance exercise).

The ACTN3 gene codifies α-actinin-3, a component of the contractile machinery in fast skeletal muscle fibers in mammals with a highly conserved sequence (Eynon et al. 2013; Lippi et al. 2010). The ACTN3 RR variant is nearly always present in elite power athletes, whereas homozygosity in the 577X allele, associated with a premature stop codon that produces complete α-actinin-3 deficiency, is prevalent in elite endurance athletes, such as marathon runners and rowers (Yang et al. 2003; Shang et al. 2010; Gutiérrez-Hellín et al. 2021; Ben-Zaken et al. 2015a). The ACTN3 XX variant shows reduced fast fiber diameter, increased activity of multiple enzymes in the aerobic metabolic pathway, altered contractile properties and enhanced recovery from fatigue (Broos et al. 2012). The presence of α-actinin-3 has a globally beneficial effect on the function of skeletal muscle in generating forceful contractions at high velocity, related to increased sprint performance. Independent studies, however, failed to demonstrate a significant association between this ACTN3 polymorphism and ultra-endurance performance (Lucia et al. 2006; Houweling et al. 2018; Baltazar-Martins et al. 2020).

Myosin light chain kinase, a calcium-calmodulin-dependent multi-functional enzyme, plays a critical role in the regulation of smooth muscle contraction (Del Coso et al. 2016). Polymorphisms in the gene that codifies the myosin light chain kinase (MYLK) (C49T and C37885A) have been associated with the level of exercise-induced muscle damage. CA heterozygotes for the C37885A polymorphism presented lower reductions in leg muscle power and lower increases in serum creatine kinase after a competitive marathon than CC runners (Del Coso et al. 2016). However, the opposite has also been suggested, with CA individuals presenting higher markers of muscle damage than CC individuals after eccentric elbow flexion exercise (Clarkson et al. 2005). Also, individuals who were TT homozygotes for C49T had a significantly greater increase in creatine kinase and myoglobin after exercise when compared with CT and CC counterparts (Huang et al. 2018; Shen et al. 2015). Collectively, it seems that variations in the sequence of the MYLK gene may contribute to the capacity of skeletal muscle to withstand muscle contraction. Still, more research is needed to identify the MYLK variants that produce beneficial muscle phenotypes and to determine whether exercise training may overcome the potential influence of variants in the MYLK gene on muscle performance.

The ACE gene has two alleles, insertion (I) and deletion (D): the D allele of the human ACE is habitually associated with higher activity of angiotensin-converting enzyme than the I allele in both tissue (Yan et al. 2018) and serum (Zhang et al. 2017). This genetic variation might be associated with many heritable traits, including skill parameters and physiological and sports performance (Moran et al. 2006; Scott et al. 2005), with an increased frequency of the ACE I allele in elite endurance athletes (Myerson et al. 1999; Nazarov et al. 2001; Amir et al. 2007) and an increased frequency of the ACE D allele in elite sprint and power athletes (Myerson et al. 1999; Costa et al. 2009). The mechanism underlying the association of the D allele with power-oriented, anaerobic sports is probably mediated through differences in skeletal muscle strength gain, since a greater training-related increase in quadriceps muscle strength has been associated with the D allele, while the I allele may influence endurance performance through improvements in substrate delivery and the efficiency of skeletal muscle, with subsequent conservation of energy stores (Moran et al. 2006; Nazarov et al. 2001; Lippi et al. 2010).

AMP deaminase 1 is an important regulator of energy metabolism in the muscle fiber: it catalyzes the irreversible deamination of adenosine monophosphate (AMP) to inosine monophosphate (IMP) and is one component of the purine nucleotide cycle (Van den Berghe et al. 1992). Subjects with the TT genotype in c.34C > T in the gene that encodes this protein (AMPD1) have diminished exercise capacity and cardiorespiratory responses to exercise when compared to C-allele carriers (Varillas Delgado et al. 2020; Rico-Sanz et al. 2003). Moreover, carriers of the T allele have a limited training response of ventilatory phenotypes during maximal exercise (Rico-Sanz et al. 2003; Rubio et al. 2005; Gineviciene et al. 2014) and a reduced submaximal aerobic capacity (Gineviciene et al. 2014; Lucia et al. 2009).

The combination of various polymorphisms has recently been studied to improve the understanding of muscle performance. In this case, several studies show that the combined analysis of ACE and ACTN3 provides a better explanation of muscle performance. A recent study by Wagle et al. (Wagle et al. 2021) suggested a potential combined influence of the ACTN3 RR and ACE DD genotypes on isometric and dynamic strength testing. This study may serve as a framework to generate hypotheses regarding the effect of genetics on performance, suggesting that specific genetic profiles in these polymorphisms might influence human physical performance (Ma et al. 2013; Papadimitriou et al. 2016, 2018; Eynon et al. 2009). This current knowledge of ACE and ACTN3 should be expanded in future studies of cohorts of athletes to include the other genes involved in muscle performance (CK-MM, MYLK and AMPD1), using TGS methodology to show the additive or synergistic influence of these polymorphisms on sports performance in both endurance and power modalities.

Gene doping

Athletes have long been using substances to improve their sports performance (De Rose 2008; Cantelmo et al. 2020). The use of doping substances had the purpose of boosting performance in various parts of the ancient world, and some strategies persist nowadays (Verroken 2000; Palmi et al. 2019; La Gerche and Brosnan 2018). With the evolution of knowledge about the organism’s physiological dynamics, new ways have emerged of achieving enhanced performance through gene manipulation. There is the possibility of manipulating human genetic material and regulating gene expression to increase or decrease the production of certain enzymes and other proteins associated with processes that are key for human performance (Cantelmo et al. 2020; Unal and Ozer Unal 2004). The genetic material could be manipulated using normal or genetically modified cells including transferring nucleic acid polymers or their analogues (Brzeziańska et al. 2014; Heuberger and Cohen 2019). This gene manipulation arose with the advent of gene therapy, which has been practiced and improved for the treatment of severe human diseases over the last 30 years (Baoutina et al. 2013).

Increased athletic performance through genetic manipulation could be achieved with the development of the clustered regularly interspaced short palindromic repeats with DNA endonuclease Cas9 (CRISPR/Cas9) methodology (Pokrywka et al. 2013). To restrain this practice, the World Anti-Doping Agency (WADA) added “gene doping” to the “Prohibited List” in 2004. The 2018 version further expanded the gene doping ban by adding to the list of banned substances and methods any gene editing agents designed to alter genome sequences and/or the transcriptional or epigenetic regulation of gene expression. Nevertheless, to date, it is unclear how WADA will enforce such rules (Thevis et al. 2019) or whether it has the capacity to detect any type of gene modification with the current methodology for obtaining samples.

Gene therapy was defined by Haisma and de Hon (2006) as the transfer of genetic material to human cells for the treatment or prevention of a disease. In this regard, the genetic materials that can be transferred are DNA, ribonucleic acid (RNA) or genetically altered cells (Haisma and de Hon 2006). This therapy still poses numerous limitations and risks, such as the patient’s immune response and consequent treatment rejection (Haisma and de Hon 2006; Gaffney and Parisotto 2007; Cantelmo et al. 2020). To date, no athlete has been formally accused of using gene therapy as a form of doping, perhaps because there is no official confirmatory test for this therapy, which is still in the early stages of development (Paßreiter et al. 2020). Indeed, there was no efficient gene-editing system until the discovery of the CRISPR technique (Jinek et al. 2012), which shows severe side effects related to gene therapy (van der Gronde et al. 2013). To date, we can only speculate which genes would be the best candidates for gene doping according to their potential to boost sports performance, as in the case of those that encode erythropoietin (EPO), myostatin blockers, the insulin-like growth factor (IGF-1), growth hormone (GH), vascular endothelial growth factor (VEGF), fibroblast growth factor (FGF), ACTN3, cytosolic phosphoenolpyruvate carboxykinase (PEPCK), and peroxisome proliferator-activated receptor-δ (PPARδ) (Cantelmo et al. 2020; Azzazy et al. 2005; Haisma and de Hon 2006; Hakimi et al. 2007; van der Gronde et al. 2013).

Of special interest is the success of the BNT162b2 mRNA-based vaccine against the SARS-CoV-2 virus, a promising recent development in the regulation of transgene expression (Wang et al. 2020) with ramifications to other research fields. In fact, there is evidence of the success of viral EPO gene transfer using adenovirus, adeno-associated virus, and retrovirus in rodents and non-human primates (Neuberger et al. 2012; Segura et al. 2007) with the purpose of increasing red blood cell production and hemoglobin concentration to improve endurance performance, providing experimental data to support the potential use of gene therapy employing adenovirus or retrovirus vectors as an effective and difficult to control form of gene doping.

Both the scientific and the sports communities have yet to reach a consensus regarding banning or regulating gene therapy in sports and, consequently, gene doping. Much still needs to be studied, developed, and discussed, but what seems to be decisive to resolve this question is to enforce some limits. Defining which conduct should be discouraged or regulated, elaborating effective and applicable rules in the scientific environment, and being able to detect undesirable behaviors with the purpose of exposing and punishing those responsible for these doping methods, are policies that still need to be developed in terms of gene doping (Solomon et al. 2009; Cantelmo et al. 2020).

The future in genetics and sports performance

Most current knowledge in sports genetics has been generated from candidate gene analysis using small sample sizes (typically, a few hundred) (Wang et al. 2013), with new candidate genes presented every year in various physiological pathways that increase knowledge of sports performance (Varillas Delgado et al. 2019). Candidate gene association studies have often produced inconclusive results. Therefore, the vast majority of the candidate genes for sporting performance discovered to date are not strongly associated with phenotypes of interest (Wang et al. 2013), with a few exceptions, such as ACTN3 and ACE. Genetics and other innate factors represent just a portion of the likelihood of becoming an excellent athlete, but at the same time, genetics provides useful insights, as sports performance can be ultimately defined as a set of polygenic traits intimately associated with the characteristics of the sport (Lippi et al. 2010). It is commonly accepted that many genes are involved in sports performance (Ahmetov and Fedotovskaya 2015; Ahmetov et al. 2016).

Exercise training regulates the expression of genes encoding various enzymes in muscle and other tissues, and genetic research in sports will help to clarify several aspects of human biology and physiology, such as RNA- and protein-level regulation under specific circumstances (Lippi et al. 2010). Although an individual’s potential for excelling in endurance or power performance can be partly predicted from specific genetic variants, complex gene–gene interactions, environmental factors and epigenetic mechanisms are also important contributors to the “complex trait” of being an athletic champion (Buxens et al. 2011). A combination of the fields of genomics, epigenomics and transcriptomics along with improved bioinformatic tools, in addition to precise phenotyping, is required for future research to understand the inter-relations of exercise physiology, sports performance and susceptibility to diseases (Ehlert et al. 2013; Eynon et al. 2011b).

DNA polymorphisms (1% frequency or greater) (Takahashi and Tajima 2017) and rare DNA mutations (< 1% frequency) (Sloan et al. 2018) can generally be classified as genetic markers associated with endurance, power and strength (or combined power/strength). It should be noted that other possible athlete statuses involving coordination and flexibility have still not been studied. Although most of the genetic variants associated with sports performance seem to be implicated in musculoskeletal and cardiopulmonary functions, genes associated with enhanced functions of the central nervous system (CNS) may also contribute to elite athletic traits (Kitazawa et al. 2021). The significance of a particular sport-related genetic marker is based on several criteria, such as the type of the polymorphism (stop loss/gain, frameshift, missense, synonymous, 3′/5′-UTR, intronic, non-coding RNA, 5′/3′-near gene, intergenic, etc.) (Ergun and Oztuzcu 2016; Calabrese et al. 2020; Hon et al. 2017; Ahmetov et al. 2016).

Despite the obvious role of genetics in athletic performance, there is little unequivocal evidence in support of a specific genetic variant with a major effect on a relevant sports performance phenotype. This holds at least across the normal range of human trait distributions, either because sports performance phenotypes are complex traits and fundamentally polygenic (numerous genes with small effects) (Loos et al. 2015b; Sarzynski et al. 2016; Muniesa et al. 2010; Lippi et al. 2010), or because researchers failed to take into consideration the full range of environmental effects, or both (Rice et al. 1997; Aaltonen et al. 2020; Ghosh and Mahajan 2016). It is very important to note that each DNA locus can probably explain only a very small proportion of the phenotypic variance (0.1–1%) (Timmons et al. 2010; Lippi et al. 2010). Therefore, very large sample sizes are needed to detect associations, and various combinatorial approaches should be used.
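To give a sense of scale for the sample sizes implied by loci that explain 0.1–1% of phenotypic variance, the following is a rough sketch rather than a formal power analysis. It assumes a single quantitative phenotype, treats the variance explained as a squared correlation, and uses the standard Fisher z approximation for a two-sided test. The function name, the 80% power target and the two significance thresholds (0.05 for a single candidate variant and 5e-8 as a conventional genome-wide threshold) are illustrative assumptions, not values taken from the studies cited above.

```python
import math
from statistics import NormalDist


def required_sample_size(variance_explained, alpha=0.05, power=0.80):
    """Approximate n needed to detect a locus explaining a given share of variance.

    Assumption: the locus effect is modelled as a correlation
    r = sqrt(variance_explained) and tested two-sided using the
    Fisher z-transformation approximation.
    """
    if not 0 < variance_explained < 1:
        raise ValueError("variance_explained must be between 0 and 1")
    r = math.sqrt(variance_explained)
    std_normal = NormalDist()
    z_alpha = -std_normal.inv_cdf(alpha / 2)  # critical value, two-sided test
    z_power = std_normal.inv_cdf(power)       # quantile for the target power
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)


if __name__ == "__main__":
    # Hypothetical scenarios: 0.1% and 1% of variance explained.
    for r2 in (0.001, 0.01):
        for alpha in (0.05, 5e-8):
            n = required_sample_size(r2, alpha=alpha)
            print(f"variance explained={r2:.3f}, alpha={alpha:g}: n is about {n}")
```

Under these assumptions, a locus explaining 1% of the variance needs several hundred participants at a nominal threshold and several thousand at a genome-wide threshold, while a locus explaining 0.1% needs roughly ten times more in each case. This is consistent with the point that cohorts of a few hundred athletes are underpowered for most single variants.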

Considering the low predictive capacity of findings with SNPs, researchers have turned to TGS, whereby several elite athlete-associated SNPs are combined into a single polygenic score. This method was developed by Williams and Folland (2008). Ruiz et al. (2009, 2010) and, more recently, Varillas et al. (Varillas Delgado et al. 2019, 2020) utilized this approach with elite endurance and power athletes. However, whilst a TGS may help in discriminating between athlete and non-athlete populations, Santiago et al. (Santiago et al. 2010) demonstrated that, in a group of rowers, it did not distinguish between different levels of performance (i.e., World vs. National medalists). A subsequent study by Hughes et al. (2011) demonstrated that there is considerable similarity in polygenic scores in humans (athletes and non-athletes alike) when a low number of SNPs are used, such that, again, this approach would likely have limited real-world specificity and sensitivity. Lastly, the configuration of TGS to differentiate between populations of athletes usually confers the same score on each SNP, while the individual influence of each gene on elite athlete status is likely uneven, with some genetic variants exerting a greater influence than others. For this reason, the weighted-total genotype score (w-TGS) model presented by Varillas Delgado et al. (2020), a score of genes involved in energy and iron metabolism, suggests that the relative weighting of the influence of each SNP may add further value in the development of increasingly accurate TGS. As such, further work should seek to identify additional SNPs associated with elite athlete status across a variety of sporting phenotypes and replicate existing SNPs (Pickering et al. 2019a). New genotyping/sequencing technologies are developing rapidly, allowing a more thorough search across the whole human genome (Ng and Kirkness 2010; Boulygina et al. 2020).
Genome-wide association studies (GWAS) represent a promising and productive way to study sports-related genotypes in the future (Boulygina et al. 2020; Al-Khelaifi et al. 2020; Ahmetov et al. 2015). This approach involves rapidly scanning > 5 million markers in complete sets of DNA from many individuals, using microchips, to find DNA polymorphisms associated with a particular trait (Ahmetov et al. 2016; Dehghan 2018; Flanagan 2015). One of the advantages of GWAS is that it is unbiased concerning genomic structure and previous knowledge of the trait (hypothesis-free), in contrast to candidate gene studies, where knowledge of the trait is used to identify candidate loci contributing to the trait of interest (Ahmetov et al. 2016; Wang et al. 2013; Al-Khelaifi et al. 2020; Flanagan 2015). However, the use of hypothesis-free models in GWAS leads to the identification of genetic variants linked to exercise and endurance performance with no mechanistic basis to explain how the variant may modify a specific trait. Despite this limitation, GWAS facilitated by high-throughput genotyping technologies have been enormously successful in identifying single-nucleotide polymorphisms (SNPs) that are associated with complex traits (Gallagher and Chen-Plotkin 2018; Visscher et al. 2017). For this reason, GWAS should be able to aid in resolving the low predictive capacity of current genetic research (Al-Khelaifi et al. 2019; Eynon et al. 2011b). The systematic review presented by Ahmetov et al. shows 41 markers identified in the last 2 years by performing GWAS on African American, Jamaican, Japanese, and Russian athletes, indicating that GWAS represent a promising and productive way to study sports-related genotypes (Ahmetov et al. 2016). Of note, 31 genetic markers have shown positive associations with athlete status in at least 2 studies and 12 of them in 3 or more studies. Conversely, the significance of 29 markers was not replicated in at least 1 study, raising the possibility that several findings might be false positives. These results open up a path for future research including multicenter GWAS and whole-genome sequencing in large cohorts of athletes with further validation and replication, contributing to the discovery of large numbers of the causal genetic variants (mutations and DNA polymorphisms) that would partly explain the heritability of athlete status and related phenotypes (Al-Khelaifi et al. 2020; Ahmetov et al. 2015).
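
To make the TGS and w-TGS concepts discussed above more concrete, the following minimal sketch computes both scores for one hypothetical athlete. It assumes the commonly described scaling in which each polymorphism receives a genotype score of 0, 1 or 2 (2 being the genotype presumed most favorable) and the sum is rescaled to 0–100. The gene list, the genotype scores and the weights are invented for illustration, and the weighted normalization shown here is one plausible formulation, not necessarily the exact w-TGS model of Varillas Delgado et al. (2020).

```python
def total_genotype_score(genotype_scores):
    """Unweighted TGS on a 0-100 scale.

    genotype_scores maps each polymorphism to 0, 1 or 2, where 2 is the
    genotype assumed to be most favorable for the phenotype of interest.
    """
    if not genotype_scores:
        raise ValueError("at least one polymorphism is required")
    if any(score not in (0, 1, 2) for score in genotype_scores.values()):
        raise ValueError("genotype scores must be 0, 1 or 2")
    n = len(genotype_scores)
    return 100.0 * sum(genotype_scores.values()) / (2 * n)


def weighted_total_genotype_score(genotype_scores, weights):
    """Weighted TGS on a 0-100 scale (illustrative normalization).

    weights maps each polymorphism to a positive relative weight; these
    would have to be estimated from data and are hypothetical here.
    """
    if any(score not in (0, 1, 2) for score in genotype_scores.values()):
        raise ValueError("genotype scores must be 0, 1 or 2")
    if any(weights[gene] <= 0 for gene in genotype_scores):
        raise ValueError("weights must be positive")
    weighted_sum = sum(weights[gene] * score for gene, score in genotype_scores.items())
    max_sum = 2 * sum(weights[gene] for gene in genotype_scores)
    return 100.0 * weighted_sum / max_sum


if __name__ == "__main__":
    # Hypothetical athlete and hypothetical weights, for illustration only.
    athlete = {"ACE": 2, "ACTN3": 1, "AMPD1": 2, "CKM": 0, "MYLK": 1}
    weights = {"ACE": 2.0, "ACTN3": 3.0, "AMPD1": 1.0, "CKM": 1.0, "MYLK": 1.0}
    print("TGS:", total_genotype_score(athlete))
    print("w-TGS:", weighted_total_genotype_score(athlete, weights))
```

With equal weights the two functions return the same value, which illustrates the point made above: the unweighted TGS implicitly assumes that every SNP contributes equally to athlete status.
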

To date, there are serious and well-placed concerns about the use of genetic information, either alone or in combination with existing measures, for talent identification in sport (Webborn et al. 2015; Guth and Roth 2013). It is generally considered that, in sporting contexts, genetic testing should not be carried out on athletes younger than 18 years of age (Pickering et al. 2019a; Webborn et al. 2015). According to the current state of knowledge, no child or young athlete should be exposed to direct-to-consumer (DTC) genetic testing to define or alter training or for talent identification aimed at selecting gifted children or adolescents. Large-scale collaborative projects may help to develop a stronger scientific foundation on these issues in the future. There remains a lack of universally accepted guidelines and legislation for DTC testing regarding all forms of genetic testing and not just for talent identification (Webborn et al. 2015). It is advocated that talent identification and development programs should be dynamic and interconnected taking into consideration maturity status and the potential to develop rather than excluding children at an early age, and more representative real-world tasks should be developed and employed in a multidimensional design to increase the efficacy of talent identification and development programs (Vaeyens et al. 2008).

According to a recent review by Joyner (Joyner 2019), we are still a long way from developing a complete understanding of the genetic influence on sports performance, and in the future, we might be able to model it using field tests or even data from real competitions, which could be more predictive of an athlete’s performance than current approaches based on SNP and TGS studies. Limitations in the methodologies used in studies of genetic association and sports performance could be overcome using GWAS with larger sample sizes and by increasing the internal and external validity of the associations between genetics and sport-specific phenotypes. In addition, the use of machine learning-based prediction models, as in the detection of diseases (Heo et al. 2019; Kalafi et al. 2019; Uddin et al. 2019), could help to answer the doubts that still exist in the field of genetics and sports performance regarding the detection of talent based on DNA testing. Still, it is important to mention that DNA testing will never outperform the predictive capacity of a field test, but it can be used together with field and laboratory testing to enhance the likelihood of detecting sports talents or athletes more prone to injury.


This narrative review discusses the current evidence on the impact of genetics on endurance- and power-based exercise performance to clearly determine the utility of genotyping to detect sports talent, enhance training or prevent exercise-related injuries. The comprehensiveness of this review is aided by the explanation of the most common methods used to assess the influence of genetics on sports performance. From an applied perspective, the ideas discussed in this review suggest that talent/sports performance phenotype identification based on DNA testing is likely to be of limited value at present, and that field testing, which is essentially a higher-order ‘bioassay’, is likely to remain a key element of talent identification for the foreseeable future (Webborn et al. 2015). Nevertheless, the pace of adoption of new technologies to assess DNA variants, the reduction of costs for genetic testing and the cooperation of research groups to produce large-scale sample sizes will help to enhance the predictive capacity of genetic testing in the future, at least in sports where a clear endurance vs. power phenotype is strongly associated with overall performance.