Background

The worldwide prevalence of chronic diseases, such as cardiovascular disease, cancers, stroke and diabetes, is rising [1]. Low cardiorespiratory fitness is strongly associated with chronic diseases and premature mortality [2,3,4,5,6,7]. To alleviate the health and economic burden associated with low cardiorespiratory fitness, health guidelines across the world recommend individuals undertake regular exercise [1].

Exercise training can increase cardiorespiratory fitness and decrease chronic disease via a number of mechanisms [7]. Adaptations include improvements to cardiac size, stroke volume (increase in volume of blood pumped from the left ventricle), cardiac output (volume of blood pumped from the heart per minute), pulmonary blood flow and respiratory function, supply of oxygen-rich blood to working muscles (increased number of capillaries and blood volume), muscle mitochondrial function and content, oxidative enzyme capacity, vascular wall health and function, and biomechanical efficiency [2, 7]. It has been suggested that improvements in cardiorespiratory fitness in response to exercise training vary greatly between individuals, with some people responding well or very well (‘responders’ or ‘high-responders’) to exercise training, whereas others have only mild increases in their cardiorespiratory fitness following similar exercise training (‘low-responders’) [4, 5, 8,9,10,11]. Importantly, these responses need to be compared to within-subject random variation to ascertain true inter-individual differences [12]. The ability to change cardiorespiratory fitness is a multifactorial trait influenced by environmental factors (such as exercise training) and genetic factors [4, 5, 11]. Considering cardiorespiratory fitness is one of the best integrative predictors of morbidity and mortality risk, it may be important to understand how genetics predict the variability in response to exercise training. This knowledge could lead to targeted personalised exercise therapy to decrease the burden of chronic disease.
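The comparison against within-subject random variation noted above [12] can be illustrated with a minimal sketch. It assumes a known typical error of measurement (the within-subject standard deviation of repeated VO2max tests) and treats a change as exceeding measurement noise only if it is larger than a multiple of the standard deviation of a difference score, which is the typical error multiplied by √2. The function name, the 1.96 multiplier and the example values are illustrative assumptions, not the procedure of any cited study.

```python
import math


def classify_change(pre, post, typical_error, z=1.96):
    """Classify a pre/post change relative to measurement noise.

    typical_error is the within-subject SD of a single measurement, in the
    same units as pre and post (assumed known from repeat testing).
    z is an assumed multiplier (1.96 corresponds to a 95% interval).
    """
    change = post - pre
    # The difference of two noisy measurements has SD = typical_error * sqrt(2).
    threshold = z * typical_error * math.sqrt(2)
    if change > threshold:
        return "increase beyond noise"
    if change < -threshold:
        return "decrease beyond noise"
    return "within noise"


# Hypothetical values in mL/min, for illustration only.
print(classify_change(3000, 3400, typical_error=100))
print(classify_change(3000, 3200, typical_error=100))
```

Under these assumed values, a 400 mL/min gain exceeds the noise threshold whereas a 200 mL/min gain does not, which is why apparent 'low-responders' cannot be identified from a single pre/post difference alone.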

The gold standard measure for cardiorespiratory fitness is maximal oxygen uptake (VO2max), which is quantified as the maximal amount of oxygen the body can use in 1 min, during dynamic work with large muscle mass [13]. Research into human variation of VO2max was first undertaken over forty years ago, with several authors identifying a strong genetic influence on VO2max in twins [14, 15]. Subsequent studies have identified significant familial aggregation for VO2max trainability. For example, authors have found greater variance between pairs of monozygotic (MZ; identical) twins than within pairs of twins for VO2max training response after standardized aerobic training interventions [16, 17]. The strongest evidence to date on this topic was found in the HEalth, Risk factors, exercise training And GEnetics (HERITAGE) family study [18]. Four hundred seventy-three Caucasian adults from 99 nuclear families completed 20 weeks of Moderate Intensity Continuous Training (MICT). The average increase in VO2max was 400 mL O2/min, with a range from −114 to +1097 mL/min. This difference was two and a half times greater between families than within families, with a 47% heritability estimate for VO2max training response [18]. A major limitation of these findings, however, is that there was no comparator control group.

Since this familial longitudinal research, the Human Genome Project completed sequencing of the human genome, resulting in significant advancements in genetic analysis capabilities. This led to a better understanding of genetic variations of large populations. Analyzing genetic variants on a population level using techniques such as candidate gene analysis, GWAS, whole genome and exome sequencing and RNA expression analysis (RNA-seq, or microarrays) has resulted in the possibility of developing ‘personalized genomics’. This aims for biological profiling to provide more effective health management and treatment [5]. However, research in the field of exercise genomics is still in its infancy and much work is needed before genomic tools could be utilized to personalize exercise training programs [19].

The aim of this study was to systematically review the literature and identify genetic variants that have been associated with VO2max trainability following an aerobic exercise training intervention. Given the infancy of this research field, results should only be used to provide the basis for future research. This research should aim to confirm previous findings and investigate mediators that can influence gene expression. Importantly, future genetic studies in this area should attempt to investigate the physiological functions that contribute to improving VO2max training response and overall health outcomes. Findings from ongoing research may assist clinical professionals to provide personalized evidence-based medicine centered on phenotype, contributing to the fight against chronic disease.

Methods

A comprehensive search of four databases (PubMed, Embase, Cinahl, Cochrane) was completed from their inception until October 2016. Studies focusing on genes and their VO2max/VO2peak response to supervised aerobic training were sought with the following search terms: genetic profiling, polymorphism, single nucleotide polymorphisms, SNPs, genetic variants, predictor genes, trainability, endurance training, cardiovascular fitness, cardiorespiratory fitness, VO2max, VO2peak, aerobic power, aerobic fitness, aerobic capacity. A full list of search terms can be found at the end of this review.

Two authors (CW and JC) agreed on the criteria for inclusion. Articles were incorporated if they were: original, peer-reviewed research; included an aerobic intervention, with minimum 75% supervision; included genetic variant testing; included a maximal VO2max/peak using direct gas analysis from an incremental test (pre and post intervention); conducted on humans; and written in English.

Using an extraction grid, one author (CW) conducted the initial screening analysis. After removing duplicates and scanning the titles and abstracts of articles, those meeting the inclusion criteria were reviewed. Data recorded from the review consisted of the author’s name and place of study, study design, study sample, tissue source, genotyping method used, gene and variant examined, genotype, gene expression (if examined), intervention used, possible mediators (such as medications and health concerns), and the influence of the genetic variant investigated on VO2max change. Further articles were retrieved by snowballing from the reference lists of included articles. Articles included in the review are in Table 1.

Table 1 Summary of included articles

A summary of key findings from the included articles is provided in Tables 2 and 3. Limitations were assessed by two authors (CW and JC) based on the intervention, genotyping method used, study design and sample used. Table 4 was developed to highlight which predictor genes for VO2max trainability merited further exploration. A third author (MW) examined Tables 1, 2, 3 and 4 to ensure all genetic variants, genomic coordinates and genotypes, were described with a consistent annotation.

Table 2 Summary of findings from candidate gene studies
Table 3 Summary of hypothesis-free studies
Table 4 Predictor genes that may influence VO2max training response

Results

Of the 1635 articles identified, 35 met the inclusion criteria (see Fig. 1). A summary of these articles is provided in Tables 1, 2 and 3. From the 35 articles, 97 genetic variants were identified as being significantly associated with VO2max trainability (Table 4).

Fig. 1 PRISMA flow chart of article selection process

Study characteristics

Across the studies DNA samples from 4212 individuals were used. Tissue sources were predominantly blood leucocytes, lymphoblastoid cell lines and buccal cells. Genotype was primarily identified through PCR-RFLP (polymerase chain reaction restriction fragment length polymorphism based analysis) for candidate genes and Illumina Human CV370-Quad Bead Chips for GWAS analysis (which can capture over 370,000 SNPs per participant).

Overall, 68% of participants in the reviewed studies were men, and ages ranged from 17 to 75 years. The average BMI of participants was 25.3 kg/m2 (SD 2.36). Where detailed, DNA samples were taken from a variety of ethnicities, including Caucasian (74.5%), Asian (13.5%), African-American (7.5%), Hispanic (4.3%) and Native American (0.2%).

The 35 included articles described 15 cohorts, with three cohorts providing subject data for 19 articles (see Table 1 for details). Nine articles [20,21,22,23,24,25,26,27,28] used data from the HERITAGE study and five [29,30,31,32,33] reviewed Caucasian participant data from the Cardiac Rehabilitation and Genetics of Exercise Performance and Training Effect (CARAGENE) study. Five studies examined clinical data from 102 young male and apparently healthy police recruits in China [34,35,36,37,38]. The remaining samples came from independent clinical studies focusing on apparently healthy but sedentary adults from a variety of ethnicities including Caucasians, Asians, African-Americans, Native American and Hispanics [13, 39,40,41,42,43,44,45,46,47,48,49,50,51,52,53].

Most reviewed studies (n = 32) used a single-group longitudinal design. However, one study compared three groups using a longitudinal design [28]. One study used retrospective data from two Randomized Controlled Trials (RCT) [20]; and one was a double-blind study [39].

Twenty-eight studies examined a MICT intervention. Two studies examined protocols using High Intensity Interval Training (HIIT) [28, 40]. The 5 remaining studies trained participants by running at Ventilatory Threshold (VT) [34,35,36,37,38]. Training intensity was measured using a percentage of VO2max, Heart Rate Reserve (HRR), VT, Maximal Power (Pmax) or Maximum Heart Rate (HRmax). Intensities varied between 50 and 85% VO2max, 95–105% VT, 50–85% Pmax, 80–85% HRR and 50–80% HRmax. Training volume varied between 20 and 90 min per session (2–4×/week). The period of interventions ranged from 4 weeks to 9 months. Training modalities consisted primarily of cycle ergometers and treadmills.
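Because the included studies prescribed intensity on different scales, the same percentage can correspond to different target heart rates. The following is a minimal sketch, assuming the standard heart rate reserve (Karvonen) calculation, target = resting HR + fraction × (HRmax − resting HR). The function names and participant values are hypothetical and are not taken from any included study.

```python
def target_hr_from_hrr(hr_rest, hr_max, fraction):
    """Target heart rate (beats/min) at a given fraction of heart rate reserve."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return hr_rest + fraction * (hr_max - hr_rest)


def target_hr_from_hrmax(hr_max, fraction):
    """Target heart rate (beats/min) at a given fraction of maximum heart rate."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return fraction * hr_max


# Hypothetical participant: resting HR 60 beats/min, maximum HR 180 beats/min.
print(target_hr_from_hrr(60, 180, 0.5))    # 50% HRR
print(target_hr_from_hrmax(180, 0.5))      # 50% HRmax
```

For this hypothetical participant, 50% HRR corresponds to a higher target heart rate than 50% HRmax, which illustrates why intensities reported on different scales are not directly interchangeable across studies.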

Only six studies incorporated a standardized diet prior to and during the intervention period [23, 41,42,43,44,45]. Three articles included strength training [20, 39, 47] and two studies included military training [39, 47] as the intervention.

Genotyping findings

1. Candidate gene studies

The candidate gene association approach requires a prior hypothesis that the genetic polymorphisms of interest are causal variants or in strong linkage disequilibrium (LD) with a causal variant, and would be associated with a particular exercise-related phenotype at a significantly different rate than predicted by chance alone (may be higher or lower). This approach is effective in detecting genetic variants that are either directly causative, or belong to a shared haplotype that is causative [54]. Thirty-two candidate gene studies were based on the gene’s molecular function and possible association with VO2max trainability (Table 2).

Genes associated with muscular subsystems

VO2peak can be influenced by muscle efficiency and it has been hypothesized that genes encoding muscular subsystems may contribute to the genetic variability in VO2peak training response [33]. Twelve genes and 21 genetic variants related to muscular phenotypes were investigated in 935 (76 female) cardiac patients from the CARAGENE study [33]. Three out of the 21 genetic variants were significantly associated (p < 0.05) with an increase in VO2peak following 3 months of MICT (2–3 × 90-min sessions per week at 80% HRmax). These variants included GR:c.68 > A (G/A genotype, number of people with genotype; n = 55) in the glucocorticoid receptor gene (GR; rs6190), CNTF:c.115-6G > A (AA genotype, n = 21) in the ciliary neurotrophic factor gene (CNTF; rs1800169) and the AMPD1:c.133C wild type (CC genotype, n = 652) of the adenosine monophosphate deaminase gene (AMPD1; rs17602729). Furthermore, a larger change in relative VO2peak was reported in patients with a greater number of these variants (Area Under the Curve (AUC): 0.63; 95% Confidence Interval (CI): 0.56–0.7; p < 0.01). More specifically, those with a gene predictor score (GPS) of one or fewer positive response alleles had an average increase in VO2peak of 16.7%. Those with four or more positive response alleles had an average increase of 25%, with each positive response allele contributing approximately 1% (13.5 mL/min) to the increase in VO2peak.
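The gene predictor score described above is a count of positive response alleles across the three associated variants. The sketch below shows one way such a count could be computed. The choice of positive allele at each variant (A at rs6190, A at rs1800169, C at rs17602729) is an assumption inferred from the genotypes listed above rather than a definition taken from [33], and the function and variable names are hypothetical.

```python
# Assumed positive response allele per variant (inferred, not confirmed by [33]).
POSITIVE_ALLELE = {
    "rs6190": "A",      # GR
    "rs1800169": "A",   # CNTF
    "rs17602729": "C",  # AMPD1
}


def gene_predictor_score(genotypes):
    """Count positive response alleles.

    genotypes maps each rsID to a two-letter genotype string, e.g. "GA".
    The score ranges from 0 to 6 for three biallelic variants.
    """
    score = 0
    for rsid, allele in POSITIVE_ALLELE.items():
        genotype = genotypes[rsid].upper()
        if len(genotype) != 2:
            raise ValueError(f"expected a two-letter genotype for {rsid}")
        score += genotype.count(allele)
    return score


# Hypothetical participant, for illustration only.
example = {"rs6190": "GA", "rs1800169": "AA", "rs17602729": "CC"}
print(gene_predictor_score(example))
```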

Caucasians aged between 17 and 65 years from the HERITAGE study who were homozygous (TT genotype) for the AMPD1:c.133C > T (p.(Gln45*)) (rs17602729) variant (n = 6), had a lower VO2max training response (<121 mL/min; p = 0.006), compared to the CT and CC genotypes (n = 497) following 20 weeks of MICT (3 × 50 min per week at 55–75% HRmax) [46].

The serine/threonine protein kinase 1 (AKT1) gene has been linked to growth and skeletal muscle differentiation [44]. In a study of 109 Caucasians (50–75 years old), men (n = 22) with the AKT1:c.-350G > T (rs1130214) variant (TT/GT genotype) significantly increased their VO2max compared to men (n = 29) with the GG genotype (fold increase of 1.2 ± 0.02 vs 1.1 ± 0.02, p = 0.037) following 24 weeks of MICT (3 × 20–40 min per week at 50–75% HRR) [44].

The glutathione S-transferase P1 (GSTP1) c.313A > G variant has been associated with an impaired ability to remove excess reactive oxygen species. This is hypothesised to increase the exercise training response by better activation of cell signalling pathways resulting in positive muscle adaptations [45]. While investigating 62 Polish females’ (19–24 years-old) response to 12 weeks of MICT (3 × 60 min per week at 50–75% HRmax), participants (n = 30) with the GSTP1:c.313A > G (GG + GA genotype) demonstrated a 2 mL/kg/min greater improvement in VO2max compared to AA genotypes (n = 5) following training (absolute p = 0.029, relative p = 0.026, effect size = 0.06) [45].

Genes associated with electrolyte balance

The electrogenic transmembrane ATPase (Na+/K+-ATPase) gene may contribute to VO2max trainability by affecting the electrolyte balance and membrane excitability in working muscles [24]. Examining Caucasian data from the HERITAGE study, it was found that those homozygous for a recurrent 3.3-kb deletion in exon 1 of the ATP1A2 gene (n = 5) had a 41% (45 mL/min) lower training response compared to heterozygotes (n = 87) [24]. This exon encodes one part (the alpha-2 subunit) of the Na+/K+-ATPase protein. This genotype also had a 48% (197 mL/min) lower VO2max training response than homozygotes (n = 380) for a repeated 8.8-kb in exon 1 of the ATP1A2 gene following 20 weeks of MICT (p = 0.018) [24]. VO2max gains were 29% (130 mL/min) and 39% (160 mL/min) greater in offspring homozygous for a 10.5-kb deletion in exon 21–22 (n = 14) compared to heterozygotes (n = 93) and homozygotes (n = 187) respectively (p = 0.017) [24].

The angiotensin-converting enzyme (ACE) gene contributes to blood pressure, fluid and salt balance [55]. Elite endurance athletes are more likely to have the Insertion (I) allele [56] which relates to lower ACE activity and reduced blood pressure response during exercise, whereas sprint/power athletes are more likely to have the Deletion (D) allele and the DD genotype [57] and subsequently higher ACE activity. Caucasians from the CARAGENE study with the homozygous II genotype (frequency of 0.23 and 0.18 for men and women respectively) had a 2.1% greater VO2max training response (p = 0.047) compared to the DD genotype (frequency of 0.3 and 0.36 for men and women respectively) [31]. When eliminating those on ACE inhibitors, the improvement increased by 3% (p = 0.013) [31]. On the other hand, VO2max trainability was 14–38% greater (p = 0.042) in HERITAGE Caucasian offspring with the DD genotype (n = 81) [25]. Three studies found no association with ACE or angiotensinogen genetic variants and VO2max training response in 53 Caucasians (average age 19 years) following 12 weeks of military training [47]; 147 multi-ethnic 19–24 year-old adults following 8 weeks of military training [39]; and 83 Brazilian policemen (average age 26 years) following 17 weeks of MICT (3 × 60 min per week at 50–85% VO2peak) [48].

Genes associated with lipid metabolism

Genotypes of the perilipin (PLIN1) gene may influence training response via intracellular lipolysis and energy production [43]. In 101 Caucasians (50–75 years old), there were no significant differences between carriers and non-carriers of the PLIN1:c.504 T > A variant (rs1052700) after 24 weeks of MICT (20–40 min, 3 × per week) [43].

The peroxisome proliferator activated receptor delta (PPARD) gene affects fatty acid oxidation and energy production [22]. African-Americans (n = 19) from the HERITAGE study with the PPARD exon 4 + 15 (CC genotype) had a significantly lower VO2max training response (> 50 mL/min lower; p = 0.028) and power output (> 15 W lower; p = 0.005) compared to the C/T and TT genotypes (n = 230) [22].

Apolipoprotein E (APOE) variants affect the level of lipids in the blood, cell lipid uptake and endothelial vascular dilation [23]. APOE has 3 common alleles: E2 (TT/TT), E3 (TT/CC), E4 (CC/CC) at two SNPs (rs429358, rs7412), which can create six possible genotypes (E2/E2, E3/E3, E4/E4, E2/E3, E2/E4, E3/E4) [58]. The APOE E4 allele has been associated with Alzheimer’s disease [59], higher levels of low density cholesterol (LDL-C) and a greater risk of coronary heart disease compared to E3 (wild-type) and E2 carriers [23]. Chinese men (18–40 years) with the APOE E2/E3 (n = 20) and E3/E4 (n = 31) genotypes had a significantly higher VO2max training response (Odds Ratio (OR) = 0.68 (95% CI: 0.04, 1.32), p = 0.04; and OR = 0.60 (95% CI: 0.09, 1.11), p = 0.02, respectively) compared to other APOE genotypes following 6 months of progressive MICT (3 × per week at 60–85% VO2max) [13]. Similarly, Chinese women (18–40 years) with the APOE E2/E3 (n = 25) and E3/E4 (n = 29) genotypes had significantly higher VO2max training responses compared to other APOE genotypes (OR = 0.62 (95% CI: 0.05, 1.18), p = 0.03; and OR = 0.62 (95% CI: 0.09, 1.15), p = 0.02, respectively) [13]. Men and women (ethnicity unknown) with the E3/E3 APOE genotype (n = 43) had an 8% lower training response compared to the E2/E3 (n = 40) and E3/E4 genotypes (n = 37) (p < 0.01, Bonferroni-corrected) following 6 months of MICT (4 × 50 min per week at 60–85% VO2max) [42]. However, there was no significant difference in the VO2max training response between APOE genotypes in men and women from the HERITAGE study (n = 766) [23]. Similarly, in 51 males (40–80 years old, ethnicity not confirmed) there was no difference in VO2max training response between genotypes [41].
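The mapping from the two SNPs to the six APOE genotypes listed above can be sketched as follows. The sketch assumes unphased, forward-strand genotypes (alleles T/C at both rs429358 and rs7412) and assumes that the rare haplotype carrying C at rs429358 together with T at rs7412 is absent, so that a double heterozygote is read as E2/E4. The function name is hypothetical and this is not the genotyping procedure of any included study.

```python
def apoe_genotype(rs429358, rs7412):
    """Infer the APOE genotype from unphased genotypes at two SNPs.

    Each argument is a two-letter string of C/T alleles, e.g. "CT".
    Assumed haplotypes: E2 = T/T, E3 = T/C, E4 = C/C (rs429358/rs7412).
    """
    g1, g2 = rs429358.upper(), rs7412.upper()
    for g in (g1, g2):
        if len(g) != 2 or set(g) - set("CT"):
            raise ValueError("genotypes must be two-letter strings of C and T")
    n_e4 = g1.count("C")       # each C at rs429358 marks one E4 allele
    n_e2 = g2.count("T")       # each T at rs7412 marks one E2 allele
    n_e3 = 2 - n_e4 - n_e2     # remaining alleles are E3
    if n_e3 < 0:
        # Would require the rare C/T haplotype, which this sketch excludes.
        raise ValueError("genotype combination not covered by E2/E3/E4")
    return "/".join(["E2"] * n_e2 + ["E3"] * n_e3 + ["E4"] * n_e4)


# Hypothetical inputs, for illustration only.
print(apoe_genotype("TT", "CC"))
print(apoe_genotype("CT", "CC"))
```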

Genes associated with oxidative phosphorylation and energy production

Mitochondrial DNA (mtDNA) encodes several enzyme subunits involved in oxidative phosphorylation, and may be a key factor in endurance and cardiorespiratory fitness [56]. Research of mtDNA variants in 41 inactive Japanese men (mean age 20.6 years) failed to find a significant difference in trainability after 8 weeks of MICT (3–4 × 60 min per week at 70% VO2max) [49]. In contrast, 3 men (17–25 years) with the mtDNA variant in subunit 5 of ND5 had a lower VO2max training response compared to other mtDNA variants (a gain approximately 0.22 L/min smaller; p < 0.05) following 12 weeks of MICT (3–5 × 45 min per week at 85% HRRmax) [50].

The creatine kinase muscle (CKM) gene has been associated with reduced fatigue from increased adenosine triphosphate (ATP) production [26, 27]. Using data from the HERITAGE study, parents and offspring homozygous for the 1170 bp allele (n = 12) had a lower VO2max training response (3 times and 1.5 times lower respectively; p < 0.05) compared to other CKM genotypes (n = 148). This explained 9 and 10% of the inter-individual variation in VO2max change respectively [26]. A nominal genetic linkage was identified in siblings (n = 277) who shared two alleles (1170 base pairs or 985 + 185 base pairs) at the CKM locus identical by descent (IBD), with these siblings having similar changes in VO2max compared to siblings with fewer alleles IBD (p = 0.04) [27]. In an earlier study focusing on muscle-specific inherited variations, no association was found in 295 Caucasians (18–30 years old) between CKM or adenylate kinase (AK1) variants after a randomized controlled trial that included 15 weeks of endurance training versus maximal power contraction interval training [40]. Similarly, no association was found with the CKM gene and VO2max trainability in 937 Caucasian patients with coronary artery disease following 3 months of MICT (2–3 × 90 min aerobic sessions per week at 80% HRmax) [29].

Nuclear respiratory factor 1 (NRF1) and nuclear factor (erythroid-derived 2)-like 2 (NFE2L2) [36, 37], contribute to mitochondrial biogenesis and oxidative phosphorylation [60]. In a study involving 102 physically active Chinese male soldiers (average age 19 years), there was no association between NRF1 and NFE2L2 genotypes or haplotypes and VO2max trainability after 18 weeks of 3 × 5000 m runs per week at 95–105% VT [36, 37].

Genes associated with oxygen delivery

Nitric oxide causes coronary and arterial vasodilation, contributing to oxygen delivery regulation [32]. Data from the CARAGENE study were used to investigate genes associated with nitric oxide bioavailability [32]. These included nitric oxide synthase 3 (NOS3), cytochrome b-245 alpha chain (CYBA, also known as p22-PHOX), glutathione peroxidase (GPX1), catalase (CAT), superoxide dismutase 3 (SOD3), vascular endothelial growth factor A (VEGFA), peroxisome proliferator-activated receptor alpha (PPARα) and peroxisome proliferator-activated receptor gamma coactivator-related 1 (PPARC1) [32]. Participants carrying the C allele of the CAT:c.262 T > C variant (n = 342) had up to 3.1% greater improvements in VO2max training response compared to participants with the TT genotype (n = 521) following MICT (f = 3.6; p = 0.02). Participants with the NOS3 1.4 haplotype combinations (n = 36) had a 6.4% lower training response compared to the 3.3 haplotype combinations (n = 133) (p < 0.05). However, these associations were not significant after Bonferroni correction. No other associations were found with other genes or haplotypes related to nitric oxide availability and endothelial function [32]. Similarly, in a cohort of 80 Portuguese (20–35 years old) police recruits, there was no association between NOS3 genotypes (−786 TT/TC/CC, 894 GT/TT/GG) and VO2peak response following 18 weeks of 3 × 80-min per week of graded running training [59]. Additionally, no association was found with PPARGC1 and VO2max trainability in 102 Chinese male police recruits following MICT [36].
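The effect of a Bonferroni correction on nominally significant results such as those above can be sketched in a few lines. The number of tests used here (eight, one per gene listed above) is a hypothetical value chosen only for illustration; the number of comparisons actually corrected for in [32] is not stated in this review.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_value, n_tests, alpha=0.05):
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Hypothetical number of tests (8); p = 0.02 is the nominal CAT p-value above.
print(bonferroni_threshold(0.05, 8))
print(significant_after_bonferroni(0.02, 8))
```

Under this assumed number of tests, a nominal p-value of 0.02 would not pass the corrected threshold, consistent with the loss of significance reported above.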

The beta-2-adrenergic receptor (ADRB2) gene helps to support oxygen delivery to working muscles via the adrenergic receptors [30]. In participants from the CARAGENE study, there was no association found between ADRB2 genotypes or haplotypes and VO2max trainability [30].

The hypoxia-inducible factor 1 alpha (HIF1A) gene is a transcriptional regulator that controls angiogenesis (blood vessel development) and metabolism by increasing the expression of hypoxia-induced genes, such as VEGF [52]. Caucasians 60 years and over with the HIF1A:c.1744C > T (rs11549465; C/T genotype; n = 37) had a significantly lower training response (0.3 mL/kg/min; p = 0.03) compared to those with the CC genotype (n = 64) following 24 weeks of MICT (3 × 20–40 min per week at 50–70% VO2max) [52].

The 5′-aminolevulinate synthase 2 (ALAS2) gene is highly expressed in erythroid cells and is imperative for hemoglobin and myoglobin synthesis [53]. Seventy-two Chinese participants (18–22 years old) allocated to one of 13 ALAS2 genotypes with compound dinucleotide repeat lengths (157–184 bp) were placed in a 4-week ‘HiHiLo’ training program (varying between low and high altitude training at 75% VO2max) [53]. Baseline hemoglobin levels and change in VO2max with training were significantly higher in subjects (n = 25) with dinucleotide repeats ≤ 166 bp (p < 0.05). No significant associations were found between VO2max trainability and other genes related to oxygen transport and utilization genotypes in 102 young Chinese soldiers following 18 weeks of 3 × 5000 m runs per week [35, 37, 38]. These genes include mitochondrial transcription factor A (TFAM) [35] and hemoglobin-beta locus (HBB) [38].

2. Hypothesis-free studies

Over the last decade, with the advent of technological advances allowing researchers to genotype millions of genetic variants (e.g. SNPs) in each individual, the investigation of the contribution of common variants to traits is now feasible. Unbiased and hypothesis-free genome wide association studies (GWAS) for exercise/health-related traits have emerged.

Three studies have used GWAS to identify genes associated with the VO2max response to exercise training [20, 21, 28]. These are outlined in Table 3.

The first investigated two clinical trials and data from the HERITAGE study [28]. RNA expression profiling and VO2max testing were performed on 24 healthy and inactive Caucasian men (average age 24 years) before and after a 6-week training intervention (4 × 45-min cycling sessions per week at 70% VO2max). Muscle biopsies from the vastus lateralis were collected and the RNA expression of genes was correlated with changes in VO2max by analysing oligonucleotide arrays. Pearson correlations were used to identify the relationships between the median logit normalised probe sets and the number of times they were selected. In the 24 subjects, using a median correlation cut-off greater than 0.3, 29 genes were selected more than 22 out of 24 times. The sum of expression of these 29 genes was found to have a significant linear relationship with VO2max change following endurance training (r2 = 0.58, p < 0.00001). Across the group, VO2max improved on average by 14%, with changes ranging from −2.8% to 27.5% (p = 0.0001). More than 20% of the group had a response less than 5%. A gene set enrichment analysis found that the oxidative phosphorylation gene was upregulated (False Discovery Rate (FDR) = 1.1%), which was associated with an increased reliance on lipids during training (RER decreased on average by 10% post training, p < 0.0001). To identify if these predictor genes would be similar in a different sample, a 12-week blind study on 17 young and active Caucasian men was conducted. Training consisted of 1 day of testing, 2 sessions of interval training (3 × 3-min intervals at 40–85% Pmax) and 2 × 60–120-min cycle sessions (55–60% Pmax) each week. The 29 predictor genes were also significantly associated with VO2max trainability in this group (p = 0.02). The haplotypes of these predictor genes were then genotyped using candidate genes identified from the HERITAGE study.
Six genetic variants were associated with VO2max trainability: SMTNL2, DEPDC6, SLC22A3, METTL3, ID3 and BTNL9 (p < 0.01 each). A stepwise regression model using 25 variants from the predictor set and 10 variants from the HERITAGE study (Table 3) found that eleven SNPs (included in Table 4) contributed to 23% of the differences seen in residual VO2max gains, which corresponded to approximately 50% of the genetic variability in VO2max trainability (seven variants from the RNA predictor set and four from the HERITAGE project). Reciprocal RNA expression validation found that three of four HERITAGE candidate genes enhanced the original RNA transcript predictor model. Overall, more than 90% of gene expression did not change. However, OCT3 was downregulated in high responders and H19 was upregulated in low responders (FDR < 5%). BTNL9, KLF4 and SMTNL2 also had small but inconsistent changes in expression (i.e. dissimilar in high vs low responders) (FDR < 5%).
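The relationship between the summed expression of a predictor gene set and the change in VO2max, reported above as r2 = 0.58, rests on a Pearson correlation. A minimal sketch of that calculation is given below, using only the standard library. The helper names and the small data set are hypothetical and do not reproduce the normalisation or probe-set selection steps of [28].

```python
import math


def summed_expression(expression_by_subject):
    """Sum expression values across predictor genes for each subject."""
    return [sum(values) for values in expression_by_subject]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: 4 subjects x 3 predictor genes, and VO2max change (%).
expression = [[1.0, 2.0, 0.5], [1.5, 2.5, 1.0], [0.8, 1.9, 0.4], [2.0, 3.0, 1.2]]
vo2max_change = [8.0, 15.0, 5.0, 20.0]

r = pearson_r(summed_expression(expression), vo2max_change)
print(f"r = {r:.2f}, r2 = {r * r:.2f}")
```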

A GWAS examining 324,611 variants from the HERITAGE study was completed to identify possible predictor genes associated with VO2peak [20]. Based on single-variant analysis, 39 variants (Table 3) were associated with gains in VO2peak although none of these achieved genome-wide or suggestive significance (p = 1.5 × 10^-4) [19]. The strongest predictor for training response was found in the Acyl-CoA synthetase long-chain family member 1 (ACSL1) gene (4:g.185725416A > G; rs6552828) which accounted for 7% of the training response (p = 1.31 × 10^-6). After a stepwise multiple regression analysis of the thirty-nine variants, 21 were suggested to account for (or at least contribute to) 49% of the variance in VO2max trainability (included in Table 4; p < 0.05). The strongest predictors were found in SNPs associated with: PR domain-containing protein 1 (PRDM1); glutamate receptor, ionotropic, N-methyl-D-aspartate 3A (GRIN3A); N-methyl-D-aspartate receptor (NMDA); potassium voltage-gated channel subfamily H member 8 (KCNH8); zinc finger protein of cerebellum 4 (ZIC4); and ACSL1. An unweighted ‘predictor score’ based on the contribution of these 21 variants to VO2max was created. For each variant, a score of ‘0’ represented homozygous for the low-response allele, ‘1’ represented heterozygous and ‘2’ represented homozygous for the high-response allele. Individuals with a score equal to or less than 9 (n = 36) had an average VO2max improvement of 221 mL O2/min. In contrast, those (n = 52) with a score equal to or greater than 19 had an average VO2max increase of 604 mL/min.
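The unweighted predictor score described above sums a 0/1/2 code over the 21 variants, giving a possible range of 0 to 42. A minimal sketch of the scoring and of the two cut-offs reported above (≤ 9 and ≥ 19) follows. The function names and the group labels ("low", "intermediate", "high") are hypothetical conveniences and are not terms used in [20].

```python
N_VARIANTS = 21  # number of variants in the predictor score, as reported above


def predictor_score(high_allele_counts):
    """Sum the per-variant counts of high-response alleles (0, 1 or 2 each)."""
    if len(high_allele_counts) != N_VARIANTS:
        raise ValueError(f"expected {N_VARIANTS} variant codes")
    if any(count not in (0, 1, 2) for count in high_allele_counts):
        raise ValueError("each variant code must be 0, 1 or 2")
    return sum(high_allele_counts)


def score_group(score, low_cutoff=9, high_cutoff=19):
    """Assign a score to a group using the cut-offs reported in the text."""
    if score <= low_cutoff:
        return "low"
    if score >= high_cutoff:
        return "high"
    return "intermediate"


# Hypothetical participant: heterozygous at every variant.
codes = [1] * N_VARIANTS
total = predictor_score(codes)
print(total, score_group(total))
```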

The 15 most significant variants were tested for replication in a sample of African-Americans from the HERITAGE study, women in the Dose Response to Exercise (DREW) study (n = 112), and the men and women in the Study of a Targeted Risk Reduction Intervention through Defined Exercises (STRRIDE) (n = 183) [20]. Variants in NDN (15:g.24008071 T > C; rs824205) and DAAM1 (14:g.59477414C > T; rs1956197) were replicated in the DREW study, the ZIC4 (3:g.146957166 T > C; rs11715829) variant was replicated in the STRRIDE study, and CAMTA1 (7:g.7015105 T > C; rs884736) and RGS18 (1:g.192059022G > A; rs10921078) variants were replicated in African-Americans from the HERITAGE study. Four variants in the genes supervillin (SVIL), neuropilin 2 (NRP2), titin (TTN) and carboxypeptidase (CPVL) identified by Timmons et al. [28] were also found by Bouchard et al. [20]; however, at a significance of 0.008, these variants were not included in the multivariate regression analysis.

Using the HERITAGE cohort, an extended analysis was performed, with 2.5 million variants analysed [21]. To reduce bias associated with outlier variants, the second most significant variant p-value was used to determine genotype and changes in VO2max. Even with an extended analysis, the ACSL1 gene was shown to have the most significant variant (4:g.185725416A > G; rs6552828), which confirmed findings by Bouchard et al. [20], who identified the most significant variant at each gene (Table 3). The following genes and their variants were also replicated in both studies: CAMTA1 (rs884736), RYR2 (rs7531957), g.63226200G > A (rs6090314), C12orf36 (rs12580476) and CD44 (rs353625) [20, 21].

The gene prioritisation tool ‘CANDID’ was then used to rank candidate genes for changes in VO2max [21]. This was done via: 1) a weighted analysis based on variant gene expression in targeted tissues; 2) GWAS p-value change in VO2max; 3) literature related to candidate genes; and 4) ‘cross species sequence conservation’ [21]. The top-ranking candidate genes from the GWAS and CANDID tool (Table 1) were then investigated for possible biological mechanisms and changes in VO2max. As a result, variants were allocated into four groups: 1) broad effects on exercise-related processes (such as the electron transport chain, physical fitness, skeletal development and other cardiorespiratory markers); 2) moderately strong scores against selective exercise-related processes; 3) high and low scores across several exercise-related processes; 4) low scores across all exercise-related processes.

Variants and their involvement in pathways related to changes in VO2max response were then examined [21]. Out of the sixteen pathways found, variants related to pantothenate and co-enzyme A (CoA) biosynthesis, PPAR gene signalling and immune function signalling had the highest level of ‘burden’ (variants contributing to trainability). The variants related to long-chain fatty acid transport (including ACSL1) and fatty acid oxidation strongly influence VO2max training response via lipid metabolism processes and the tricarboxylic acid cycle, both of which affect the availability of adenosine triphosphate and subsequently training response.

Predictor genes

Out of the 35 articles analysed (candidate genes and GWAS studies), 97 predictor genes were identified as possible contributors to VO2max trainability (Table 4). These genes were based on what authors deemed significant, or the most significant, for their particular study. Thirteen of these predictor genes were replicated between at least two studies (bolded in Table 4). The traits for VO2max trainability (e.g. which genotype was related to the training effect and whether it was a low or high responding genotype) were not outlined for each variant and hence will require confirmation in future studies.

Discussion

This systematic review aimed to summarize genetic variants that have been identified as influencing VO2max trainability. We have reviewed 35 studies that have reported 97 genes associated with an exercise training-induced improvement in VO2max. It has been estimated that VO2max trainability has a significant heritable component of around 50% [39].

Several studies identified the same variants, including the lipid-related ACSL1:c.-32-716 T > C (rs6552828) [20, 21] and skeletal muscle-related AMPD1:c.133C > T [33, 46], as well as the intracellular calcium regulator RYR2:c.6166 + 552 T > G, cellular function-related CD44 (rs353625), transcriptional activator CAMTA1 (rs884736), non-coding C12orf36 (rs12580476) and apoptotic regulator 20:g.63226200G > A (rs6090314) [20, 21]. Additionally, Bouchard et al. [20] were able to replicate the variants in genes from the HERITAGE study, including: growth suppressor NDN, cell cortex function-related DAAM1, development-related ZIC4 and signal transduction inhibitor RGS18. Numerous identified variants were found in pathways that contribute to training response (e.g. calcium signaling, immune function, angiogenesis, mitochondrial biogenesis), with pathways and associated SNPs possibly influencing each other and overall trainability [21]. Several articles found conflicting results for the electrolyte balance gene ACE [25, 31, 47, 48], the lipid production gene APOE [13, 23, 41, 42], and the energy production-related mtDNA [49, 50] and CKMM [26, 27, 29, 40] variants. All other ‘predictor genes’ identified are yet to be replicated.

While most of the articles examined in this review have focused on one or a few candidate genes/markers (n = 32), it is noted that exercise-related phenotypes are complex traits and are polygenic (i.e. influenced by many genes working together) with each genetic variant likely to be contributing a small percentage (typically less than 1%) to the overall change in VO2max [33, 39, 61]. Thus relying on one variant as a predictor is misguided; rather it has been suggested that a gene predictor score (GPS) based on numerous variants has a greater probability to determine higher and lower responders for VO2max trainability. For example, a score of ‘0’ represents a homozygote for a low-response variant; ‘1’ represents heterozygous and ‘2’ represents homozygous for a high-response variant [20]. A higher score indicates a greater possible VO2max training response (and vice versa). A similar model has been suggested in elite athletes aiming to determine the probability of an individual with a theoretically ‘optimal’ polygenic profile for endurance sports. The ‘optimal’ profile using a so-called ‘total genotype score’ (TGS, ranging from 0 to 100, with ‘0’ and ‘100’ being the worst and best genotype combinations, respectively) was quantified from a simple algorithm resulting from the combination of candidate polymorphisms [62, 63].
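The scoring scheme described above can be sketched in a few lines of Python. The 0/1/2 coding follows the text; the rescaling to a 0 to 100 total genotype score, 100 × (sum of scores) / (2 × number of variants), is one simple formulation assumed here for illustration, and the function names and the example genotype list are hypothetical.

```python
def gene_predictor_score(genotype_scores):
    """Sum per-variant scores.

    Coding (from the text): 0 = homozygous for the low-response allele,
    1 = heterozygous, 2 = homozygous for the high-response allele.
    """
    if not genotype_scores:
        raise ValueError("at least one variant is required")
    for score in genotype_scores:
        if score not in (0, 1, 2):
            raise ValueError(f"invalid genotype score: {score!r}")
    return sum(genotype_scores)


def total_genotype_score(genotype_scores):
    """Rescale the summed score to 0-100 (assumed formulation).

    0 means every variant is homozygous low-response,
    100 means every variant is homozygous high-response.
    """
    max_score = 2 * len(genotype_scores)
    return 100.0 * gene_predictor_score(genotype_scores) / max_score


# Hypothetical individual genotyped at five variants.
individual = [2, 1, 0, 2, 1]
print(gene_predictor_score(individual))   # summed score out of a possible 10
print(total_genotype_score(individual))   # same information on a 0-100 scale
```

Both scores weight every variant equally; a model that weights variants by effect size would need per-variant coefficients, which the reviewed studies do not consistently report.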

These predictor genes, along with muscle RNA and protein expression data, provide a sound platform to further explore the cellular mechanisms underlying VO2max trainability. Further research will need to consider several limitations identified from the literature to date. For example, the lack of replication found between articles and conflicting results with certain variants may be a result of several main limitations (typically in study design). Firstly, most of the articles used a hypothesis-driven candidate gene approach (n = 32), several articles used retrospective data from similar cohorts (n = 19), and many lacked a control group and randomization (n = 31). While it is understandable that in the past, high-throughput SNP microarray or gene sequencing technology was not available to use, by looking at one or only a few gene variants (whereas it is estimated that the human genome consists of about 40 million common gene variants) it is almost impossible to generate meaningful information. Similarly, a lack of control group makes it challenging to distinguish between individual response to an intervention and within-subject random variation [64]. Secondly, most of the exercise training studies involve a relatively small number of participants (typically n = 20 to 30; with the exception of the HERITAGE and CARAGENE studies), which results in lack of statistical power when associating genotype with a phenotype. Many of the studies also failed to include a robust significance criterion (p < 0.05 occurs approximately 10^6 times in the genome by chance). Thirdly, a lack of racial diversity (74.5% Caucasian) further reduces the power of variants detected. Finally, many of the training studies were not tightly controlled in terms of nutrition, participant baseline data (study entry), physical activity status and other lifestyle factors.
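The arithmetic behind the significance-criterion concern can be made explicit. The sketch below assumes independent tests (real variants are correlated through linkage disequilibrium, so this is only an approximation) and uses the variant counts quoted in this review; the function names are illustrative, and the Bonferroni correction is shown as one common, conservative choice rather than the criterion used by any particular study.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing `alpha` by chance alone,
    assuming every null hypothesis is true and tests are independent."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value threshold that keeps the family-wise
    error rate at `alpha` across `n_tests` tests."""
    return alpha / n_tests


n_common_variants = 40_000_000  # approximate figure quoted in the text
n_gwas_variants = 2_500_000     # variants analysed in the extended GWAS [21]

# Roughly two million chance hits at p < 0.05, i.e. on the order of a million.
print(expected_false_positives(n_common_variants, 0.05))

# Corrected per-variant threshold for the 2.5 million variant analysis.
print(bonferroni_threshold(n_gwas_variants))
```

Under these assumptions a nominal p < 0.05 is far too permissive for genome-scale analyses, which is why much stricter per-variant thresholds are needed.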

Future research needs to consider epigenetic variation of gene activity that can occur in reaction to external factors, such as additional physical activity, drugs, diet and environmental toxins [61, 65]. Such epigenetic modifications can affect all adaptations to exercise training [10]. For example, in addition to nutrition and baseline physical activity status, there were many other differences in subjects between articles not taken into consideration, including: age, training duration and volume (MICT vs. HIIT), body weight, body fat percentage, medications, clinical versus healthy populations, sleep, psychological status and the gut microbiome. Together, these are potential epigenetic modifiers (e.g. DNA methylation and histone acetylation) that can influence gene expression, molecular function and thereby influence VO2max training response [61, 66]. Whether genes or epigenetic modifiers play a larger percentage role in adaptive variability in a specific situation requires further exploration.

To address these limitations, larger-scale studies are required to ascertain if the 97 predictor genes identified from this review are similar in various cohorts (e.g. several ethnicities, ages, genders). The Athlome Project Consortium, which includes the Gene SMART study, is an example of a current larger-scale investigation examining ‘omic markers’ of training response, elite performance and injury rates/predisposition in a variety of populations [67]. Ideally, future studies will complement and expand on this research, and consider alternative forms of exercise training intensity and volume, lifestyle factors, general health, diet, medications and health history when implementing interventions and analyzing data.

Furthermore, the role of the gut microbiome, and its influence on metabolism and physiology, needs to be explored. For example, gut microbiota (which has its own genome) can interact with the tissue cellular environment to regulate gene expression [61]. Poor diet, stress, illness, the use of antibiotics, environmental toxins and poor lifestyle choices can increase inflammation within the gut, causing dysbiosis; this appears to contribute to chronic diseases and other illnesses, irrespective of genotype, age and gender [68, 69]. Interestingly, VO2max was recently shown to be related to gut microbial diversity in a human cross-sectional study [70], suggesting a link between VO2max and gut microbes. Pre- and probiotics, resistant starch and a Mediterranean diet (dietary diversification) can alter the gut microbiome [68]. Investigating how the gut and human genome interact to positively influence VO2max is warranted.

With these points in mind, the analysis of stool samples, in addition to incorporating epigenetic, transcription and proteomic analysis, may help to identify the best aerobic training or lifestyle intervention to upregulate or downregulate certain genes, signaling pathways and molecular responses required for a greater VO2max training response. Implementing tightly-controlled studies examining various mediators (training intervention, diet, lifestyle) and molecular biomarkers across various populations will help to capture accurate information related to ideal traits for VO2max trainability.

Conclusion

In total, 97 genes that predicted VO2max trainability were identified. Phenotype is dependent on several of these genotypes/variants, which may contribute to approximately 50% of an individual’s VO2max trainability. Higher responders to exercise training have more positive response alleles (greater gene predictor score) than lower responders. Whilst these findings are exciting, further randomized-controlled research with larger and more diverse cohorts is needed. Additional exploration is required to identify genetic variants and the mediators (training intensity and volume, diet, drugs, other lifestyle factors) that can potentially affect gene expression, molecular function and training response. Findings from this review and future research may assist clinicians to provide precision evidence-based medicine centered on phenotype, contributing to the fight against chronic disease.

PubMed, Embase, CINAHL and Cochrane search terms

PubMed search

gene*[ti] OR allele [tiab] OR SNP [tiab] OR genetic profiling[tiab] OR genetic variant*[tiab] OR Genomic predictor*[tiab] OR polymorphism[tiab] OR heritability[tiab] AND (exercise training [tiab] OR VO2peak[tiab] OR ‘cardiorespiratory fitness’[tiab] OR ‘maximal/maximum VO2peak’[tiab] OR ‘maximal/maximum VO2max’[tiab] OR ‘maximal oxygen consumption’[tiab] OR ‘peak oxygen uptake’[tiab] OR ‘interval exercise’[tiab] OR ‘high/low intensity exercise’[tiab] OR peak fitness [tiab] OR endurance*[tiab] OR physical fitness[tiab] OR cardiorespiratory fitness[tiab] OR endurance training [tiab] OR cardiovascular fitness[tiab] OR VO2max[tiab] OR aerobic power[tiab] OR aerobic fitness[tiab] OR exercise capacity[tiab] OR exercise training response[tiab] OR response to exercise training[tiab]) NOT animal*.

Embase

gene:ab,ti OR allele:ab,ti OR snp:ab,ti OR ‘genetic profiling’:ab,ti OR ‘genetic variant’:ab,ti OR ‘genomic predictor’:ab,ti OR heritability:ab,ti AND (vo2peak:ab,ti OR vo2max:ab,ti OR ‘cardiovascular fitness’:ab,ti OR ‘cardiorespiratory fitness’:ab,ti OR ‘aerobic power’:ab,ti OR ‘aerobic fitness’:ab,ti OR ‘exercise training response’:ab,ti OR ‘physical fitness’:ab,ti).

CINAHL

(genes OR ‘genetic variant’ OR ‘Genomic predictor’ OR polymorphism OR ‘genetic profiling’ OR ‘single nucleotide polymorphisms’ OR ‘SNPs’ OR heritability) AND (‘trainability’ OR ‘cardiovascular fitness’ OR ‘interval exercise’ OR ‘maximum O2’ OR ‘maximal oxygen consumption’ OR ‘peak oxygen consumption’ OR ‘maximal aerobic capacity’ OR ‘high/low intensity exercise’ OR ‘cardiorespiratory fitness’ OR ‘aerobic power’ OR ‘response to exercise training’ OR ‘exercise capacity’ OR ‘VO2max’ OR ‘VO2peak’ OR endurance).

Cochrane Database of Systematic Reviews

(genes OR ‘genetic variant’ OR ‘Genomic predictor’ OR polymorphism OR ‘genetic profiling’ OR ‘single nucleotide polymorphisms’ OR ‘SNPs’ OR heritability) AND (‘trainability’ OR ‘cardiovascular fitness’ OR ‘interval exercise’ OR ‘maximum O2’ OR ‘maximal oxygen consumption’ OR ‘peak oxygen consumption’ OR ‘maximal aerobic capacity’ OR ‘high/low intensity exercise’ OR ‘cardiorespiratory fitness’ OR ‘aerobic power’ OR ‘response to exercise training’ OR ‘exercise capacity’ OR ‘VO2max’ OR ‘VO2peak’ OR endurance).

Cochrane Central Register of Controlled Trials

(genes OR ‘genetic variant’ OR ‘Genomic predictor’ OR polymorphism OR ‘genetic profiling’ OR ‘single nucleotide polymorphisms’ OR ‘SNPs’ OR heritability) AND (‘trainability’ OR ‘cardiovascular fitness’ OR ‘cardiorespiratory fitness’ OR ‘interval exercise’ OR ‘maximum O2’ OR ‘maximal oxygen consumption’ OR ‘peak oxygen consumption’ OR ‘maximal aerobic capacity’ OR ‘high/low intensity exercise’ OR ‘aerobic power’ OR ‘response to exercise training’ OR ‘exercise capacity’ OR ‘VO2max’ OR ‘VO2peak’ OR endurance).