Introduction

Cancer initiation is classically associated with the induction of mutations in key oncogenes or tumor suppressor genes, due to the presence of unrepaired or misrepaired DNA lesions produced by endogenous or exogenous genotoxic agents1. Many risk factors for cancer, such as smoking, ionizing radiation, and diet, can induce DNA damage2. Higher levels of DNA/protein adducts in blood from exogenous exposures are associated with increased cancer risk3. DNA repair plays a fundamental role in the maintenance of genomic integrity4. Individuals with deficient DNA repair capacity might therefore be more susceptible to cancer.

DNA repair capacity can be assessed either with genomic/proteomic approaches or with phenotypic approaches5. A concern with genomic/proteomic approaches is that mammalian DNA damage repair mechanisms are extraordinarily complex. In humans, DNA repair involves ~450 genes in 13 different pathways, including 7 core and 6 associated pathways, with over half the proteins interacting with proteins from different pathways (Fig. 1)6; it follows that any specific genomic or proteomic methodology is unlikely to reflect overall DNA repair capacity. Even if it were possible to characterize this genetic complexity, it would be extremely challenging to implement at a clinical level. By contrast, phenotypic approaches (e.g., inducing DNA damage and then measuring the rate of DNA repair, the amount of unrepaired DNA damage, or both) have the potential to be more reflective of overall DNA repair capacity7. DNA repair phenotyping assays use fresh or cryopreserved peripheral blood mononuclear cells (PBMC) or lymphoblastoid cell lines as surrogates for the target tissue of DNA repair7. Because phenotypic approaches can reflect the totality of multiple complex pathways, a high-throughput phenotypic assay may be more feasible to implement in a clinical setting.

Figure 1. DNA repair pathways.

The purpose of our systematic review and meta-analysis is to quantitatively and qualitatively summarize the literature regarding DNA repair phenotype and risk of cancer. We assessed the association of DNA repair phenotype biomarkers with the risk of cancer by conducting a meta-analysis of all epidemiological studies published through March 2021.

Results

Overall summary of number and study design of studies

Detailed characteristics of the included studies are shown in Supplemental Table 1. Based on the inclusion criteria, we identified 55 studies of 12 different cancer types: lung (n = 20), breast (n = 10), skin (n = 7), head and neck (n = 7), bladder (n = 2), esophageal (n = 2), upper aerodigestive tract (n = 2), prostate (n = 1), gastric (n = 1), colorectal (n = 1), glioma (n = 1) and liver (n = 1). All studies used a case–control design and most used blood collected at the time of cancer diagnosis; only two were nested case–control studies using blood collected before cancer diagnosis8,9. The first nested case–control study, by Sigurdson et al.8, applied three DNA repair assays (the comet, host cell reactivation and mutagen sensitivity assays) to blood collected between 0.3 and 6 years before lung cancer diagnosis. For the mutagen sensitivity assay, the authors reported an OR of 2.09 (95% CI 1.00, 4.37) for lung cancer risk among individuals in the highest quartile of chromatid breaks/cell compared with individuals in the lowest quartile. The ORs were 1.2 (95% CI 0.54, 2.65) for the comet assay and 0.96 (95% CI 0.45, 2.04) for the host cell reactivation assay. The second nested case–control study, by Shen et al.9, used a modified host cell reactivation assay to measure homologous recombination repair capacity in blood collected from 152 breast cancer patients and their matched controls and reported an OR of 1.42 (95% CI 1.21, 2.52). An effect of similar magnitude was then found in the validation set of 50 case–control pairs using blood collected before cancer diagnosis9.

The overall pooled OR (95% CI) for DNA repair deficiency and cancer risk was 2.92 (2.49, 3.43) (Fig. 2). We observed significant heterogeneity across studies (I2 = 84.2%; p-value from Cochran’s Q < 0.0001), and the funnel plot suggested possible publication bias (p-value from Egger’s test < 0.0001; see Supplemental Fig. 1). We further examined the results by cancer type and assay to better understand the sources of heterogeneity.

Figure 2. Forest plot of meta-analysis of lower DNA repair capacity and cancer risk in the random-effects model. Individual studies are represented by ORs and 95% CIs. The dashed line indicates the value of the overall pooled OR.

Cancer type

We found that lower DNA repair capacity was associated with increased risk of all studied cancer types, with pooled ORs ranging from 2.02 (1.43, 2.85) for skin cancer to 7.60 (3.26, 17.72) for liver cancer (Supplemental Fig. 2 and Fig. 3). We observed heterogeneity across skin, lung, bladder, and breast cancer studies, while there was no evidence of heterogeneity across studies for esophageal, head and neck, or upper aerodigestive tract cancers.

Figure 3. Forest plot of meta-analysis of lower DNA repair capacity and cancer risk by cancer type in the random-effects model. Individual cancers are represented by ORs and 95% CIs. The dashed line indicates the value of 1.

Assay type

The studies in our meta-analysis used 10 DNA repair phenotyping assays: the host-cell reactivation (n = 18), mutagen sensitivity (n = 18), comet (n = 6), radiolabeled synthetic (n = 5), γ-H2AX (n = 4), end-joining (n = 2), etoposide (ETOP)-induced double strand break (n = 1), nucleotide excision repair protein (n = 1), homologous recombination repair (n = 1), and immunofluorescence (n = 1) assays. The pooled ORs (95% CI) were 2.34 (1.75, 3.14) for the host-cell reactivation assay, 3.26 (1.75, 3.14) for the mutagen sensitivity assay, 3.21 (1.97, 5.21) for the comet assay, and 5.06 (3.67, 6.99) for the γ-H2AX assay (Supplemental Fig. 3 and Fig. 4). Studies using the host-cell reactivation, mutagen sensitivity, comet, and radiolabeled synthetic assays showed evidence of heterogeneity across studies.

Figure 4. Forest plot of meta-analysis of lower DNA repair capacity and cancer risk by assay type in the random-effects model. Individual assays are represented by ORs and 95% CIs. The dashed line indicates the value of 1.

We further examined the association of DNA repair deficiency by assay type among lung and breast cancer studies, the most frequently studied and most common cancers. We found that the effects of lower DNA repair capacity on lung cancer risk were similar across the different assay types (ORs ranged from 2.14 to 3.57) (Supplemental Fig. 4A). Although there was heterogeneity across studies within assay groups, we did not see statistically significant heterogeneity in the ORs for lung cancer pooled across assay groups (p = 0.21). We did observe statistically significant heterogeneity across different assays in the breast cancer studies (p = 0.01), where the host cell reactivation assay showed the largest effect size, with a pooled OR of 7.75 (1.79, 33.49) (Supplemental Fig. 4B).

Discussion

Our meta-analysis summarized data from 55 studies and supported the hypothesis that individuals with lower DNA repair capacity are more susceptible to cancer development; this result was consistent across cancer types and DNA repair phenotypic assays. This finding suggests that measuring DNA repair phenotype could identify high-risk individuals for effective primary prevention and for risk-based screening.

Accurately identifying high-risk individuals is essential for effective primary prevention (e.g., chemoprevention)10, and for risk-based screening options11, which emphasize risk rather than age for optimal screening outcomes. Cancer risk prediction models incorporating minimally invasive blood markers, including genetic variants12 and epigenetic markers13, have shown modest improvement in discriminatory accuracy. The magnitude of the associations between DNA repair phenotype and cancer risk is much stronger than the effect sizes measured for genetic variants of DNA repair genes (ORs range from 1 to 2)14. Our previous study examined the association of DNA double strand break repair capacity with breast cancer risk15. We found that the largest differences in DNA repair capacity between cases and controls were in women younger than 40 years15. Cancer risk models incorporating DNA repair phenotypic markers may significantly improve current cancer risk prediction16. However, before DNA repair phenotyping data can be integrated into risk assessment, more studies are needed to examine intra-individual variability in DNA repair phenotype over time, and to assess whether a single measure at the time of first breast screening is useful or whether multiple measures over time are needed.

In our meta-analysis, we found significant heterogeneity across studies, which might be related to the different cancer types and different DNA repair phenotyping assays. One potential explanation for this heterogeneity is the complex interplay of genetic and environmental factors in most cancer types. An analysis of the mutation burden of 27 tumor types found substantial inter-individual variation in tumor mutational burden both between cancer types and within individual tumor types17. Moreover, variability of the cell-based assays related to inter-laboratory experimental protocols is a challenge for inference5,18. Differences in experimental conditions, including the dose of the DNA-damaging reagents, cell types and cell culture conditions, might contribute to the heterogeneity across studies18. Different lymphocyte subsets are known to respond differently to DNA damage; stimulated and non-stimulated lymphocytes also behave differently after DNA damage19,20,21. To make the heterogeneity across studies, cancer types and assays easier to interpret, future studies should report procedures and results following OECD guidelines22. In addition, we observed potential publication bias, suggesting that studies with statistically significant effects were more likely to be published.

Most studies use PBMC as surrogate tissue, assuming that PBMC are a legitimate surrogate for DNA repair in other tissues. Evidence on the correlation of DNA repair capacity between target tissue and blood is limited to one study, which found a good correlation between OGG activity in blood and lung tissue from the same individual23. Although assays using blood samples are more feasible to implement in a clinical setting, more studies are needed to evaluate the correlation of DNA repair phenotype between blood and target tissues using different assays. There are numerous methods for measuring DNA repair directly, and each has its strengths and weaknesses24. Most of the assays, such as the host-cell reactivation, mutagen sensitivity, and immunofluorescence assays, measure nucleotide excision repair capacity25. Nucleotide excision repair eliminates a wide variety of forms of DNA damage and especially deals with bulky DNA damage/adducts induced by chemical carcinogens and dimers induced by ultraviolet light26. Methods such as the comet, host-cell reactivation and radiolabeled synthetic assays can potentially measure different DNA repair pathways. End-joining, homologous recombination repair and γ-H2AX assays focus on repair of double strand breaks. In our analysis, we found that the estimated effect sizes were consistent and of high magnitude across different assays and pathways. However, DNA repair functions are redundant in the context of cellular DNA damage, and there are back-up systems. If one of the critical DNA repair pathways is impaired, other pathways may be activated, which complicates the interpretation of risk27.

Functional DNA repair assays are fundamentally more powerful than genotyping. Currently, however, few DNA repair assays are available for epidemiologic studies because the assays are labor and time intensive. Thus studies to date are limited, and there are no large-scale prospective studies or high-throughput phenotypic assays28. The resultant lack of population studies integrating these potentially informative measures with other factors limits our understanding of the fundamental cellular response to environmental exposures. Recently, however, our group developed a high-throughput γ-H2AX assay based on imaging flow cytometry (IFC), which is a faster and more efficient technique for assessing global double strand break repair capacity29. This IFC-based γ-H2AX protocol may provide a practical, high-throughput platform for measuring individual global DNA double strand break repair capacity, which can facilitate precision medicine by predicting individual radiosensitivity and the risk of developing adverse effects related to radiotherapy. The blood drop method of γ-H2AX analysis is a simple and fast assay for large-scale studies, screening and routine biomonitoring of exposure30. Cancer susceptibility is inherently complex, and polygenic risk scores using genetic data have been established and show improvement in prediction accuracy for cancer31. Our meta-analysis supports a strong association between global repair capacity and cancer risk. Measuring DNA repair capacity is a potentially powerful marker to identify subgroups at high risk of cancer. Measuring overall DNA repair capacity markers in blood may be one way of understanding the role of DNA damage and repair in cancer risk and might provide intermediate outcome markers in prevention studies. Measuring DNA repair capacity may provide a potentially robust method to identify individuals who can benefit from individual-based health risk assessment and personalized risk reduction strategies.

Established high-throughput measurement of DNA repair phenotype may also be more feasible to implement in a clinical setting than complex genomic and proteomic approaches. Incorporating DNA repair phenotype into risk models may improve model discriminatory accuracy, but large-scale prospective evidence will be needed to understand the role of timing and age at measurement and at cancer screening initiation.

Materials and methods

We used the following MeSH terms in our literature search: “cancer” AND “DNA repair phenotype” OR “DNA repair capacity” OR “comet assay” OR “Host-cell reactivation” OR “γ-H2AX assay” OR “Mutagen sensitivity assay” for studies published from 1980 to 20 March 2021 (Supplemental Fig. 5). Our initial search of the PubMed database, restricted to studies that were conducted in humans and published in the English language, returned 2045 publications for further screening. We first reviewed the title and abstract of each study and excluded 1932 studies that (1) did not examine cancer as an outcome, (2) did not use a cellular assay for DNA damage and repair, or (3) did not compare differences in DNA damage and repair between cancer cases and unaffected controls using either case–control or cohort study designs. We then reviewed the remaining 113 studies and restricted our analysis to studies (n = 55) that estimated the effect size of DNA damage and repair between cancer cases and unaffected controls. We searched the reference lists of the included publications for additional eligible publications, but no additional studies were identified. The remaining 55 publications were included in our review8,9,15,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83. From the included publications, we extracted data on study population, study design, sample size, DNA repair phenotyping assay, confounding assessment, and effect estimates for the group with the lowest DNA repair capacity compared with the group with the highest capacity, together with the corresponding 95% confidence intervals (CIs). When a study reported results for different racial groups or different DNA-damaging reagents, we treated each group as a separate comparison in our meta-analysis.

Studies included in the current meta-analysis had to meet both of the following criteria: (1) use an epidemiological study design such as a case–control or cohort study design, and (2) present odds ratios or rate ratios.
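As an illustration of how extracted estimates can be prepared for pooling, the sketch below converts a reported OR and its 95% CI into a log OR and standard error. It is a minimal Python example written for this purpose, not the analysis code used in this study (which was run in Stata); the function name log_or_and_se is hypothetical, and the approach assumes the published CI is symmetric on the log scale. The inputs are the three estimates from the nested case–control lung cancer study quoted in the Results.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Return (log OR, SE of log OR) recovered from a reported 95% CI.

    Assumes the interval is symmetric on the log scale (a Wald-type CI).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return log_or, se


# Estimates quoted in the Results for the nested case-control lung cancer study
reported = {
    "mutagen sensitivity": (2.09, 1.00, 4.37),
    "comet": (1.20, 0.54, 2.65),
    "host cell reactivation": (0.96, 0.45, 2.04),
}

for assay, (or_value, low, high) in reported.items():
    log_or, se = log_or_and_se(or_value, low, high)
    print(f"{assay}: log OR = {log_or:.3f}, SE = {se:.3f}")
```

Working on the log scale is the usual choice because the sampling distribution of the log OR is approximately normal, so inverse-variance weights can be formed directly from the recovered standard errors.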

Statistical analysis

We conducted a meta-analysis to calculate pooled odds ratios (ORs) across studies using random-effects models to account for between-study heterogeneity. To assess the heterogeneity among studies, we used the Cochran Q test84 and the I squared (I2) statistic85. Cochran's Q is calculated as the weighted sum of squared differences between individual study effects and the pooled effect across studies. The I2 statistic describes the percentage of variation across studies that is due to heterogeneity rather than chance. We used a funnel plot86 to assess the risk of bias and examine the validity of the meta-analysis. In the absence of bias, studies are symmetrically distributed around the fixed-effect estimate, because sampling error is random. When bias is present, study-level effects will be asymmetrically distributed around the global fixed-effect estimate.
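The quantities described above can be sketched in a few lines of Python. This is an illustration only: the text does not state which between-study variance estimator was used, so the DerSimonian-Laird method-of-moments estimator here is an assumption, the function name random_effects_pool is hypothetical, and the inputs are made-up numbers rather than data from the included studies.

```python
import math


def random_effects_pool(log_ors, ses):
    """Pool log odds ratios with a DerSimonian-Laird random-effects model."""
    k = len(log_ors)
    w = [1.0 / se**2 for se in ses]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    df = k - 1
    # I2: percentage of variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Method-of-moments estimate of the between-study variance tau^2
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (se**2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled_or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.959964 * se_pooled),
        "ci_high": math.exp(pooled + 1.959964 * se_pooled),
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
    }


# Made-up inputs for illustration only (not taken from any included study)
toy_log_ors = [math.log(1.5), math.log(2.5), math.log(4.0)]
toy_ses = [0.20, 0.25, 0.30]

result = random_effects_pool(toy_log_ors, toy_ses)
print(
    f"pooled OR = {result['pooled_or']:.2f} "
    f"({result['ci_low']:.2f}, {result['ci_high']:.2f}), "
    f"Q = {result['q']:.2f} on {result['df']} df, I2 = {result['i2']:.1f}%"
)
```

When the between-study variance estimate is zero, the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled estimate.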

To examine possible publication bias, we generated funnel plots and used Egger’s test87 to examine whether there were small-study effects. We also used an influence plot to evaluate whether individual studies were impacting the overall summary estimates. We performed subgroup analyses stratified by tumor site and assay type. We report results only from the random-effects models, and not from fixed-effects models, because we found significant heterogeneity across the different studies. Analyses were performed using the software Stata 15.1 (College Station, TX)88. All P-values were two-sided.
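For readers unfamiliar with Egger's test, the following Python sketch shows the regression behind it: each study's standardized effect (log OR divided by its standard error) is regressed on its precision (1 divided by the standard error), and an intercept far from zero indicates funnel plot asymmetry. This is a simplified illustration, not the Stata routine used for the analysis; the function name egger_intercept is hypothetical, the inputs are made-up numbers, and the p-value (from a t distribution with k - 2 degrees of freedom) is left to a statistics package.

```python
import math


def egger_intercept(log_ors, ses):
    """Egger regression by ordinary least squares.

    Returns (intercept, SE of intercept, t statistic on k - 2 df).
    """
    k = len(log_ors)
    if k < 3:
        raise ValueError("Egger regression needs at least three studies")
    x = [1.0 / se for se in ses]                 # precision
    z = [y / se for y, se in zip(log_ors, ses)]  # standardized effect
    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are identical")
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar
    # Residual variance with k - 2 degrees of freedom
    resid_var = sum(
        (zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z)
    ) / (k - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / k + x_bar**2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Made-up inputs for illustration only: smaller studies show larger effects
toy_log_ors = [0.35, 0.62, 0.80, 1.10, 1.45]
toy_ses = [0.10, 0.18, 0.25, 0.33, 0.45]

b0, se_b0, t = egger_intercept(toy_log_ors, toy_ses)
print(f"Egger intercept = {b0:.2f} (SE {se_b0:.2f}), t = {t:.2f}")
```

If every study estimated the same underlying effect with no small-study bias, the standardized effects would fall on a line through the origin and the intercept would be close to zero.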