Background

Ligaments are mechanical restraints that contribute to normal joint kinematics and stability [1, 2]. In this context, the anterior cruciate ligament (ACL) is the main mechanical restraint against excessive anterior tibial translation [3, 4] and also limits excessive external rotation of the knee and varus-valgus stresses, particularly under weight bearing [5, 6]. Hence, the ACL is crucial for proper knee function, and its rupture leads to acute functional instability of the joint (anteroposterior and rotational laxity), functional symptoms, and damage to other intra-articular structures [7]. The consequences include joint effusion, muscle weakness, altered movement patterns, and reduced functional performance [8].

ACL rupture is a very common and severe knee injury in sports and occurs mostly through non-contact mechanisms [9]. The most common mechanisms are a deceleration event combined with a sudden change of direction on a planted foot and/or an uncompensated dynamic valgus during landing [10, 11]. This makes ACL tears a major concern in sports involving jumping, pivoting, and cutting maneuvers [12, 13].

Treatment after an ACL rupture is mainly surgical, and rehabilitation is time-consuming, taking up to 9–12 months until complete healing and return to sport [14, 15]. Nevertheless, even with proper treatment there is a risk of a second rupture and of developing concomitant and chronic degenerative joint injuries (e.g., meniscus tears, osteoarthritis) [16]. Moreover, only 65% of soccer players return to their pre-injury level of participation 3 years after the ACL tear [17]. Altogether, this evidence highlights that an ACL tear is a social, economic, and professional burden for sports teams and the affected athlete [18]. Thus, a better understanding of the extrinsic (type and level of activity, playing surface, and equipment used) and intrinsic (sex, hormonal, anatomical, neuromuscular, and genetic contribution) risk factors is essential for designing targeted injury prevention interventions [9, 19, 20].

There is increasing evidence that genetic sequence variants play an important role in the occurrence of ACL rupture [21, 22], and single nucleotide polymorphisms (SNPs) in collagen genes have already been associated with genetic susceptibility to ACL rupture [20, 23,24,25,26,27,28]. This is plausible, since the ACL is a multi-fascicular structure built of dense, regular connective tissue composed mainly of collagen (~75% of its dry weight), predominantly type I (up to 85% of the ACL's collagen) [29, 30], which is believed to be responsible for the tensile strength of the ligament [6, 29].

Type I collagen consists of two α1 chains and one α2 chain, encoded by the genes COL1A1 (chromosome 17q21) and COL1A2 (7q21.3), respectively. SNPs in these genes can change type I collagen expression and the mechanical properties of the tissue [31] and, consequently, affect human tissue development and repair [32, 33]. For example, the COL1A1 rs1107946 SNP, located in the promoter region (−1997G > T), was associated with higher transcriptional activity, influencing the regulation of gene expression in osteoblasts [34]. Also, SNPs in coding or non-coding regions of the COL1A2 gene (rs412777, rs42524, and rs2621215) and their possible haplotypes can influence type I collagen function [31] and, consequently, change the protein structure [32].

To the best of our knowledge, no study has evaluated the combination of COL1A1 and COL1A2 polymorphisms in relation to susceptibility to ACL rupture. Thus, this study aimed to investigate the association between SNPs in both genes and sports-related ACL tears in athletes. The epidemiological, clinical, and athletic profiles were also explored to better understand the extrinsic and intrinsic risk factors for ACL rupture in Brazilian athletes.

Methods

Participants

A total of 338 athletes between 18 and 45 years old from multiple sports modalities were prospectively selected for the present study. Subjects diagnosed with ACL rupture (only cases confirmed by clinical examination and magnetic resonance imaging) or who had previously undergone ACL reconstruction surgery composed the case group (N = 146). The control group (N = 192) was composed of subjects without evidence of musculoskeletal injuries. Only subjects with complete orthopedic data and collected biological material were included. All subjects provided informed consent, and the research was approved by the local Ethics Committee (#2.455.630/2017). The study was conducted in accordance with the Helsinki Declaration.

Experimental approach

This was a case-control study. The volunteers were recruited between March 2018 and July 2019 at different sports training centers, sports competition events, and the sports trauma outpatient service at our institution. First, the athletes answered a previously validated questionnaire assessing general, training, and injury information [35]. Then, a non-invasive sample was collected from the oral mucosa for genotyping analyses of the collagen type I genes.

Musculoskeletal injuries questionnaire

In addition to the history of musculoskeletal injuries, clinical information specific to ACL tears was collected: whether the injury occurred during an official sports event, whether it occurred by contact or non-contact, reports of post-training knee pain, ACL reconstruction surgery, injury recurrence, and time away from training due to the lesion. Information on competitive practice (time and level), years of training, and weekly training hours was also recorded. The data were collected with a printed version of the questionnaire “Musculoskeletal injuries report for Brazilian athletes” [35]. The researcher who applied the questionnaire checked it immediately with the athlete to clarify doubts and increase data reliability; the answers were then transferred to a computational database and double-checked by different trained researchers.

Polymorphism genotyping

Athletes’ genomic DNA was collected using a swab from the oral mucosa and isolated using an extraction kit (Qiagen, Hilden, Germany). Genotyping analyses of COL1A1 (rs1107946) and COL1A2 (rs412777, rs42524, rs2621215) SNPs were performed using TaqMan allelic discrimination assays (Table 1). For all polymorphisms, real-time polymerase chain reaction (PCR) was performed on a QuantStudio™ 3 Real-Time PCR System (Thermo Fisher Scientific, Waltham, MA, USA). PCR amplification was performed in 8 μL reactions containing 1 μL of template DNA (3–23 ng/μL), 1× TaqMan Universal Master Mix, and 1× of each primer and probe assay. Thermal cycling began with an initial denaturation step of 10 min at 95 °C, followed by 40 cycles of denaturation at 92 °C for 15 s and annealing at 60 °C for 1 min. To assure genotyping quality, two standardized positive controls of each polymorphism genotype were included in each reaction, as previously described [36].
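The reaction composition above can be turned into per-well pipetting volumes. The following is only a sketch: the stock concentrations (a 2× master mix and a 20× assay mix) are assumptions that are not stated in the text, and the function names are hypothetical.

```python
def reaction_volumes(final_ul=8.0, template_ul=1.0,
                     master_mix_stock_x=2.0, assay_stock_x=20.0):
    """Volumes (in microliters) for one reaction at 1x final concentrations.

    ASSUMPTION: the stock concentrations (2x master mix, 20x assay) are
    illustrative defaults; the text only gives the final 1x concentrations.
    """
    # Diluting an N-fold stock to 1x requires final_volume / N of stock.
    master_mix_ul = final_ul / master_mix_stock_x
    assay_ul = final_ul / assay_stock_x
    water_ul = final_ul - template_ul - master_mix_ul - assay_ul
    if water_ul < 0:
        raise ValueError("components exceed the final reaction volume")
    return {
        "master_mix_ul": master_mix_ul,
        "assay_ul": assay_ul,
        "template_ul": template_ul,
        "water_ul": water_ul,
    }


def dna_mass_range_ng(template_ul=1.0, conc_low=3.0, conc_high=23.0):
    """DNA mass per reaction (ng) for the stated concentration range (ng/uL)."""
    return template_ul * conc_low, template_ul * conc_high


if __name__ == "__main__":
    print(reaction_volumes())
    print(dna_mass_range_ng())
```

With 1 μL of template at 3–23 ng/μL, each reaction receives 3–23 ng of DNA; the remaining volumes depend entirely on the assumed stock concentrations.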

Table 1 Characterization of COL1A1 and COL1A2 polymorphisms and probes for genotyping by TaqMan real-time PCR

Statistical analysis

The data distribution was verified with the Shapiro-Wilk test. The case athletes were subdivided according to contact or non-contact ACL rupture. Continuous variables were reported as mean ± SD, and differences between all ACL rupture cases and controls were checked using Student’s t-test. For the association analysis, continuous variables (age, years of training, and weekly training hours) were divided into quartiles according to their distribution and clinical significance. Nominal data were shown as proportions, and differences between the two groups were evaluated using the Chi-squared (χ2) test or Fisher’s exact test, when applicable. Deviations from Hardy–Weinberg equilibrium (HWE) were assessed by the goodness-of-fit χ2 test. The distributions of alleles and genotypes of the COL1A1 and COL1A2 polymorphisms were derived by gene counting, and differences in frequencies between the groups were evaluated using the χ2 test or Fisher’s exact test, when appropriate. The haplotype patterns and linkage disequilibrium coefficients (D′: absolute degree of disequilibrium; r2: degree of correlation) were inferred using Haploview (https://haploview.software.informer.com/4.2/), as previously described [37]. The association of epidemiological, sports, and training characteristics, as well as of the polymorphisms, with ACL rupture was estimated by the odds ratio (OR) with a 95% confidence interval (95% CI). A multivariable logistic regression model with the stepwise method was used to evaluate the possible associations; to build the final model, each variable was introduced considering its biological plausibility and univariate statistical significance [significance level for input: P ≤ 0.20; for output: P ≤ 0.05]. Statistical significance was set at P ≤ 0.05. All analyses were performed using the Statistical Package for Social Sciences (SPSS Inc., Chicago, IL, USA, version 20.0).
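As an illustration of the Hardy–Weinberg goodness-of-fit test mentioned above, the sketch below computes the χ2 statistic and its P value (1 degree of freedom) for one biallelic SNP. It is not the SPSS procedure used in the study, the function name is hypothetical, and the genotype counts in the usage lines are invented for demonstration only.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts (AA, AB, BB) for a biallelic SNP.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequency by gene counting: each AA carries two A alleles.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic locus: test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts, NOT data from this study.
    print(hwe_chi_square(25, 50, 25))   # counts exactly at HWE proportions
    print(hwe_chi_square(50, 0, 50))    # complete heterozygote deficit
```

The closed-form P value holds only for 1 degree of freedom, which is the case for a biallelic locus (three genotype classes, one estimated allele frequency).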
The sample size (N = 338) was appropriate to detect differences between case and control groups, assuming an odds ratio of 2.0 with a power of 0.8 and a 5% type I error (Epi Info 7, version 7.1.3., https://www.cdc.gov/epiinfo/por/pt_pc.html).
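The odds ratio with its 95% confidence interval, used throughout the association analyses, can be sketched for a single 2×2 table as follows. This is the crude (unadjusted) estimate with a Woolf (log-based) interval; the adjusted ORs reported in the study come from logistic regression, which this sketch does not reproduce. The function name and the counts in the usage line are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    a: cases with the variant      b: cases without the variant
    c: controls with the variant   d: controls without the variant
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, NOT data from this study.
    print(odds_ratio_ci(20, 80, 10, 90))
```

An interval that includes 1 indicates that the association is not statistically significant at the corresponding level.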

Results

We observed a total of 146 ACL rupture cases: 67 (45.9%) were due to a contact injury, 67 (45.9%) were non-contact, and 12 (8.2%) athletes did not report the nature of the injury. In addition, 102 (70.8%) underwent ACL reconstruction surgery, while 42 (29.2%) were waiting for the surgical procedure to be scheduled. Additional relevant ACL rupture-related information and its comparison between the contact and non-contact subgroups are displayed in Table 2. Non-contact injuries were about 2.5 times more likely than contact injuries to occur during training (Table 2).

Table 2 Clinical characteristics of the ACL injury-related information in the athletes

The ACL rupture cases (both contact and non-contact subgroups) were significantly older than controls (27.2 ± 6.1 vs. 23.0 ± 5.0 years old, P = 0.002), with longer sports practice time (12.4 ± 6.7 vs. 8.3 ± 5.8 years, P = 0.07) and more weekly training hours (15.5 ± 9.8 vs. 12.3 ± 6.7 h, P = 0.01) (Table 3). Table 3 describes the demographic, sport, and training characteristics of the studied athletes. All variables were analyzed to identify possible confounders of the association between the COL1A1 and COL1A2 SNPs and ACL rupture. Initially, for the analysis of the injured group, the variables age (P < 0.001), sex (P = 0.11), sport modality (P = 0.01), training location (P = 0.05), and years of training (P = 0.03) were entered in the logistic regression model. After the stepwise regression, age, sport modality, training location, and years of training remained in this model. For the analysis of non-contact ACL rupture, in contrast, only age, sport modality, and training location were considered confounding variables. Accordingly, age, sport modality, training location, and years of training were associated with ACL rupture in the injured group as a whole, whereas age, sport modality, and training location remained associated with non-contact ACL rupture (Table 3).

Table 3 Epidemiological, sport, and training characteristics of the athletes

The COL1A1 (rs1107946) and COL1A2 (rs412777, rs42524, rs2621215) SNPs were in Hardy–Weinberg equilibrium in the overall sample and in each group (case and control). Association analyses of the COL1A1 and COL1A2 SNPs with ACL rupture are shown in Table 4. After adjustment for confounding factors (age, sport modality, and training location), the variant genotype CC (rs42524) and the variant allele G and genotype GG (rs2621215) of the COL1A2 gene were associated with an increased risk of non-contact ACL rupture. However, no significant differences in allele or genotype distribution were detected for any SNP (COL1A1 rs1107946 and COL1A2 rs412777, rs42524, rs2621215) when comparing controls with the whole injured group (adjusted for age, sport modality, training location, and years of training).

Table 4 Association analyses of the COL1A1 and COL1A2 polymorphisms with ACL rupture

Haplotypes of the COL1A2 gene were determined for all athletes. The results revealed that the SNPs rs412777, rs42524, and rs2621215 were in strong linkage disequilibrium, forming a single haploblock (Fig. 1A). Eight haplotypes could be inferred, and the AGT haplotype was considered the wild-type/reference haplotype because it had the highest frequency in the study population (N = 320, 47.3%). However, after multivariate analysis, no significant differences in haplotype frequencies were detected between controls and the injured group, or between controls and the non-contact injury subgroup (Fig. 1B).
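For readers unfamiliar with the linkage disequilibrium coefficients, the sketch below computes D′ and r2 for one pair of biallelic loci from known haplotype frequencies. This is a simplification: Haploview estimates haplotype frequencies from unphased genotypes before computing these measures, and that estimation step is not reproduced here. The function name and the frequencies in the usage line are hypothetical.

```python
def ld_coefficients(p_ab_major, p_a_only, p_b_only, p_neither):
    """Pairwise LD from the four haplotype frequencies of two biallelic loci.

    p_ab_major: frequency of haplotype A-B
    p_a_only:   frequency of haplotype A-b
    p_b_only:   frequency of haplotype a-B
    p_neither:  frequency of haplotype a-b
    Returns (d_prime, r_squared).
    """
    total = p_ab_major + p_a_only + p_b_only + p_neither
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_a = p_ab_major + p_a_only          # frequency of allele A
    p_b = p_ab_major + p_b_only          # frequency of allele B
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab_major - p_a * p_b           # raw disequilibrium coefficient
    # Normalize D by its maximum attainable magnitude for these frequencies.
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical frequencies, NOT estimates from this study.
    print(ld_coefficients(0.4, 0.1, 0.1, 0.4))
```

D′ reaches 1 whenever at least one of the four haplotypes is absent, while r2 reaches 1 only when the two loci are perfectly correlated, which is why the two measures are usually reported together.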

Fig. 1 Haplotype analysis for COL1A2 (rs412777, rs42524, rs2621215) SNPs in Brazilian athletes. A. Numbers in boxes indicate the decimal places of D′. B. Haplotype distributions in the injured group and the non-contact ACL injury subgroup compared with the control group

In addition, a combined analysis of the four studied COL1A1 and COL1A2 SNPs was performed to investigate whether the presence of more than one SNP would increase the risk of ACL rupture (Table 5). After adjustment for age, sport modality, training location, and years of training, the combination of the variant COL1A1 genotype GT or TT (rs1107946) with the wild-type genotypes of the three COL1A2 SNPs (rs412777 - AA, rs42524 - GG, rs2621215 - TT) was negatively associated with ACL rupture compared with the reference combination of wild-type genotypes for COL1A1 (rs1107946 - GG) and COL1A2 (rs412777 - AA, rs42524 - GG, rs2621215 - TT).

Table 5 Combined analysis of the COL1A1 and COL1A2 polymorphisms and the risk of developing ACL injury

Discussion

The two main findings of this study were that the COL1A2 SNPs rs42524 and rs2621215 increased the risk of non-contact ACL tear, and that the COL1A1 rs1107946 SNP was associated with ACL tear only when the three COL1A2 SNPs (rs412777, rs42524, and rs2621215) were wild-type. These results are in line with previous studies suggesting that genetic contributions should be considered an intrinsic and non-modifiable risk factor for ACL rupture susceptibility [9], and that polymorphisms in collagen genes may contribute to the development of this injury in both sexes [25, 27, 28].

It is currently accepted that ACL tears have a multifactorial etiology, represent ~50–60% of knee injuries [38], and usually (70–84%) occur through low-energy non-contact events [39, 40]. In the present study, age, sport modality, training location, and time of practice were associated with ACL tears in general. However, in the non-contact ACL tear subgroup, only age, sport modality, and training location remained associated with the injury. These are already well-described risk factors for ACL rupture [40,41,42,43,44]. Moreover, our results reinforce the higher risk of ACL rupture according to sport modality and training location, for example, for soccer players (3-fold) and for training on grass [45].

We observed that the COL1A2 rs2621215 GG genotype was associated with a 4-fold higher risk of non-contact ACL rupture. This SNP is located in intron 46 and can interact with other functional polymorphisms and affect the splicing of the gene's introns [46]. Thus, the presence of the rs2621215 variant allele G could produce more rigid collagen and increase susceptibility to ACL rupture. Also, COL1A2 rs42524 G > C, located in exon 28, increased the risk of non-contact ACL rupture (~6-fold). This coding SNP causes an alanine-to-proline substitution at amino acid residue 459 (Ala459Pro) in the Y-position of the Gly-X-Y repeat of the collagen I α2 helical region [46, 47]. The resulting change in protein conformation affects the stiffness of type I collagen, since the variant Pro-459 residue is more rigid than Ala-459 [46]. Theoretically, more rigid collagen might produce a stiffer (or less compliant) ACL, reducing the capacity of the tissue to dissipate mechanical energy and increasing the chance of failure under lower external forces [48].

Interestingly, the combination of the variant COL1A1 rs1107946 genotype with the wild-type genotypes of the three COL1A2 SNPs (rs412777, rs42524, rs2621215) was protective against ACL tear. The rs1107946 SNP is located in the promoter region of the COL1A1 gene (−1997G > T), and the variant allele T has been reported to have higher transcriptional activity than the wild-type allele G [25, 34]. This finding agrees with Ficek et al., who did not observe any association of COL1A1 rs1107946 with ACL rupture, except when it was combined with the COL1A1 rs1800012 SNP [25]. Sivertsen et al. analyzed six polymorphisms in collagen genes (COL1A1, COL3A1, COL5A1, and COL12A1), including COL1A1 rs1107946, and did not observe an association between ACL rupture risk and the selected polymorphisms in female elite athletes from Norway and Finland. However, no combined genotype analysis was conducted [49]. Here, the cumulative effect of the COL1A1 and COL1A2 SNPs suggested a gene-gene interaction that might change the ratio of collagen type I α1/α2 chains and thus influence predisposition to ACL rupture.

Hence, genetic investigations may help in the understanding of non-contact ACL ruptures [40]. This was the first study to investigate the combination of COL1A1 and COL1A2 polymorphisms in relation to susceptibility to non-contact ACL tear in athletes. Besides several strengths, such as an adequate sample size and the confirmation of all injuries by a gold-standard imaging exam or by previous ACL reconstruction surgery, some limitations should be kept in mind. There is always a risk of recall bias in self-reported questionnaires; twelve subjects did not remember the nature of the injury (contact or non-contact) and were excluded from the subgroup analysis (Table 2); and genotyping failed for some samples. However, we believe those limitations did not affect the statistical outcomes.

Investigating the link between ACL rupture and genetic variants is important for a better elucidation of injury etiology and risk factors. The present results can contribute to a database covering different populations to identify the impact of SNPs on ACL rupture risk [50]. Knowledge of possible individual predisposing factors is essential to understand the molecular mechanism of ACL rupture and to improve prognosis through screening and prevention protocols. Given the loss of physical performance and sports ability in affected athletes and the high treatment costs, genetic information can assist in identifying susceptibility to ACL tear and contribute to individualized training and surveillance of at-risk individuals. However, caution is needed when generalizing these results to other populations, and further studies should be carried out on this topic.

Conclusion

In summary, the COL1A2 rs42524 and rs2621215 SNPs were associated with the risk of non-contact ACL rupture, which represents an advance in understanding the etiology of this injury. The combined analysis of COL1A1-COL1A2 genotypes suggests a gene-gene interaction in ACL rupture susceptibility. These findings suggest that polymorphism screening could identify athletes at higher risk of non-contact ACL rupture, who could benefit from specific training and surveillance protocols to prevent injury.