On the relationship between the heritability and the attributable fraction
Heritability is the most commonly used measure of genetic contribution to disease outcomes. Being the fraction of the variance of latent trait liability attributable to genetic factors, heritability of binary traits is a difficult technical concept that is sometimes misinterpreted as the more-easily understandable concept of attributable fraction. In this paper we use the liability threshold model to describe the analytical relationship between heritability and attributable fraction. Towards this end, we consider a hypothetical intervention that is aimed to reduce the genetic risk of the disease for a specified target group of the population. We show how the relation between the heritability and the attributable fraction depends on the disease prevalence, the intervention effect and the size of the target group. We use two real examples to illustrate the practical implications of our theoretical results.
Measuring the genetic influence on human diseases is one of the most important topics in medicine and epidemiology. This is commonly done in terms of the heritability, routinely reported using studies of siblings and twins (Lichtenstein et al. 2000; Mucci et al. 2016). For continuous traits, heritability is defined as the proportion of phenotypic variation that can be attributed to genetic variation. This definition can also be used for binary traits; however, for binary traits it is more common to define the heritability as the proportion of variance on the latent liability scale attributed to genetic variation, which is conceptually more difficult and often misinterpreted (Visscher et al. 2008; Witte et al. 2014). For example, in attempts to assess how much of a disease burden can be attributed to genetic factors, heritability is sometimes misinterpreted as an attributable fraction (AF) (Mucci et al. 2016). The AF for a particular exposure is defined as the proportion of disease cases that would be prevented if the exposure was eliminated from the population (Levin 1953). This definition differs from the heritability in two important aspects. First, in contrast to the AF, the heritability does not measure the effect of an intervention. Second, while the AF refers to a specific and well-defined exposure, which (in principle) can be eliminated, the heritability captures, in a loose sense, the aggregated impact of variation over the whole genome on the disease.
Despite differences between the concepts, it is intuitively reasonable that the heritability conveys something meaningful about the impact of genetic interventions. At one extreme, when the heritability is equal to 1, we would expect genetic interventions to have a large potential for reducing the disease prevalence. At the other extreme, when it is zero, genetic interventions will not have any impact on the disease prevalence. However, no formal analysis of the relationship between the overall heritability and the AF exists. Previous studies have either restricted the interest to the heritability attributed to a limited set of SNPs (Wang et al. 2018) or addressed the overall heritability but lacked a general formalization of this relationship (Ramakrishnan and Thacker 2012). In this work, we derive a formal link between the overall heritability and the AF by using the liability threshold model (Falconer 1965) and an extension of the AF which allows for continuous exposures (Morgenstern and Bursic 1982; Taguri et al. 2012).
The outline of this paper is as follows. First, we review the theory behind the liability threshold model and use the model to derive the relation between the AF and the heritability. Next, we illustrate the practical implications of this relationship with two real examples: one concerns the prevention of cardiovascular events by medication, and the other a comparison between two strategies for breast cancer prevention.
The liability threshold model
The epidemiological literature distinguishes between the point prevalence (i.e. the proportion of the population that has the disease at a given point in time) and the lifetime prevalence (i.e. the proportion of the population that develops the disease at some point during life). Here, we are interested in ever experiencing the disease and we are thus interested in the lifetime prevalence.
Before proceeding, we emphasize an important feature of the liability threshold model. Because the model is additive, it is, for any given value \(E<\infty\), possible to find values of G for which the liability falls below the threshold \(\beta\). In particular, this happens when \(G=-\infty\). The model thus assumes that there exists an ‘optimal’ genetic composition that will prevent the disease from occurring, regardless of what environment the subject is exposed to. For multifactorial diseases, this assumption may be reasonable as an approximation to reality, at least when the environment is not too extreme. We note, though, that even for multifactorial diseases one can easily conceive of extreme environments that would cause the disease, regardless of the subject’s genetic composition (e.g. living inside a nuclear reactor will always cause cancer).
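The consequence of additivity described above can be illustrated with a small numerical sketch. It assumes that G and E are independent and normally distributed with mean zero, and that the total liability variance is scaled to 1, so that the genetic variance equals the heritability \(h^2\) and the environmental variance equals \(1-h^2\). The function names and the example values (\(h^2=0.36\), \(p=0.06\)) are chosen for illustration only. Under these assumptions, the disease risk for a subject with genetic risk \(G=g\) is \(1-\varPhi \{(\beta -g)/\sigma _E\}\), which tends to 0 as g decreases.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def threshold(p):
    """Threshold beta that gives lifetime prevalence p when Var(G + E) = 1 (assumed scaling)."""
    return STD.inv_cdf(1.0 - p)


def risk_given_g(g, h2, p):
    """P(disease | G = g) = P(E > beta - g), with Var(E) = 1 - h2 (assumed scaling)."""
    sigma_e = (1.0 - h2) ** 0.5
    return 1.0 - STD.cdf((threshold(p) - g) / sigma_e)


# Illustrative values only: heritability 36%, lifetime prevalence 6%
h2, p = 0.36, 0.06
sigma_g = h2 ** 0.5
for z in (2, 0, -2, -4):
    # z is the genetic risk expressed in genetic standard deviations
    print(z, risk_given_g(z * sigma_g, h2, p))
```

The printed risks decrease as the genetic risk decreases, which is the sense in which a sufficiently favourable genetic composition prevents the disease in this model.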
The heritability as a parameter in the liability threshold model
The attributable fraction
The AF measures the (net) proportion of disease cases prevented by the intervention. For instance, suppose that the factual prevalence is 5%, and that the intervention reduces the prevalence to 1%. The proportion of prevented disease cases is then equal to \((0.05-0.01)/0.05=80\%\).
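The arithmetic in this example can be written as a one-line function. This is a minimal sketch; the function name is hypothetical, and p and p_star denote the factual and counterfactual prevalence, respectively.

```python
def af_from_prevalences(p, p_star):
    """Proportion of disease cases prevented when the prevalence changes from p to p_star."""
    if not 0.0 < p <= 1.0:
        raise ValueError("the factual prevalence p must lie in (0, 1]")
    return (p - p_star) / p


# Factual prevalence 5%, counterfactual prevalence 1%
af = af_from_prevalences(0.05, 0.01)
print(f"{af:.0%}")  # prints 80%
```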
In our context, the exposure of interest is the genetic risk G in the liability model. This is supposed to capture the aggregated impact of variation over the whole genome on the disease, which cannot be ‘eliminated’ in any meaningful sense. We can, however, think of hypothetical interventions that aim at manipulating the genetic risk in other ways, e.g. by changing its distribution or shifting it by a fixed constant. Thus, we use the standard definition of the AF in (5), but allow for \(p^*\) to represent the counterfactual disease prevalence under such ‘generalized’ interventions. This generalization of the AF is sometimes referred to as the generalized impact fraction (Morgenstern and Bursic 1982; Taguri et al. 2012).
In practice, interventions may be targeted towards subgroups of the population due to, for instance, considerations of cost-effectiveness. For example, even though preventive medication against high blood pressure and cholesterol has been shown to reduce the risk of cardiovascular events (Yusuf et al. 2016), it might not be possible to implement this intervention on the whole population, since those at low risk will not have a sufficiently high benefit to motivate bearing the cost of the medication. Instead, it is more efficient to target the intervention to those at the highest genetic risk of a cardiovascular event (Tada et al. 2016).
We allow for targeted interventions that only apply to subjects with particularly high genetic risks, e.g. those who have a genetic risk above a certain quantile in the genetic risk distribution. We thus define the target group as those subjects for whom \(G>b\sigma _G\), where b is a fixed constant that corresponds to a quantile in the standard normal distribution. For instance, \(b=1.64\) and \(b=1.96\) correspond to 0.95 and 0.975 quantiles, respectively, and setting b to \(-\infty\) implies that we include the whole population in the target group. In practice, a high genetic risk group could be identified by using familial risk or a genetic risk score (Yoon et al. 2002; Belsky et al. 2013; Khera et al. 2016; Tada et al. 2016).
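The correspondence between the size of the target group and the constant b can be sketched as follows. The function name is hypothetical; the only assumption is that the standardized genetic risk \(G/\sigma _G\) follows a standard normal distribution, as in the liability threshold model.

```python
from statistics import NormalDist


def cutoff_for_top_fraction(q):
    """Return b such that a fraction q of the population has G > b * sigma_G."""
    return NormalDist().inv_cdf(1.0 - q)


# Target groups consisting of the 5%, 2.5% and 1% at the highest genetic risk
for q in (0.05, 0.025, 0.01):
    print(q, round(cutoff_for_top_fraction(q), 2))
```

For target groups of 5%, 2.5% and 1%, this gives b of approximately 1.64, 1.96 and 2.33, respectively.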
Properties of the attributable fraction
We note that the hypothetical scenario in Fig. 1 is unrealistic, in that the disease prevalence is unusually high.
The relationship between the AF and the disease prevalence is rather intricate, as the prevalence appears in both numerator and denominator of the expression in (10). When \(p=1\) (everybody gets the disease), both terms in the numerator of (10) simplify to the univariate distribution function \(\varPhi (-b)\), so that the AF equals 0. When p approaches 0, it can be shown that the AF approaches 1 (see Appendix B). We have made an extensive grid search over k, b, \(h^2\) and p. Based on this grid search, we conjecture that the AF decreases monotonically with p; however, we have not been able to prove this analytically.
In the previous section, we have derived the theoretical relationship between the heritability and the AF, by means of a hypothetical intervention that reduces the genetic risk of a disease. In this section, we illustrate the implications of this relationship through practical examples. In particular, we show how we can use the relationship to investigate the population impact of various intervention strategies.
Example 1: blood pressure and cholesterol-lowering medication
High blood pressure and cholesterol levels are well-known risk factors for cardiovascular events such as acute myocardial infarction (AMI) and stroke (Mozaffarian et al. 2015; Khera et al. 2016; Yusuf et al. 2016). These risk factors both have strong genetic components (Weissglas-Volkov and Pajukanta 2010; van Rijn et al. 2007) and can be lowered by preventive medical treatment with statins (cholesterol lowering) and angiotensin II receptor antagonists (blood pressure lowering) (Yusuf et al. 2016).
In a Swedish study, the heritability of AMI was estimated to be 36% and the prevalence of AMI in this cohort was approximately 6% (Zdravkovic et al. 2007). In a Danish study, the heritability for stroke was estimated to be 17% and the estimated prevalence of stroke in this cohort was around 4% (Bak et al. 2002).
We will now use these examples to investigate how the differences in heritability and prevalence between AMI and stroke impact the AF. We compare how the population impact differs between the diseases depending on whether the intervention is given to those at the 1% or 5% highest genetic risk of AMI and stroke, i.e. \(b=2.33\) or \(b=1.64,\) respectively. The target groups could possibly be identified by a genetic risk score developed for cardiovascular disease (Thanassoulis et al. 2012).
If the intervention is given to the 5% at the highest genetic risk, we suppose that this intervention can reduce the prevalence by 1.1 percentage points (i.e. from 6% to 4.9%) for AMI and 0.4 percentage points (i.e. from 4% to 3.6%) for stroke. Such a genetic risk reduction corresponds to an intervention effect of \(k=1\) for both diseases.
The examples of AMI and stroke are marked in Figs. 2 and 3. If the intervention is given to 1% of the population, the AF is 4.5% for AMI and 2.9% for stroke. When the intervention is given to 5% of the population, the AF is 17.9% for AMI and 10.9% for stroke. From these two examples we observe how an intervention with larger coverage may drastically increase the population impact of the intervention. We note, however, that the benefit of increasing the target group eventually levels off, as the intervention also covers subjects with small genetic risk who do not benefit from the intervention. In general, we also observe that the smaller the prevalence, the larger is the AF for a fixed heritability, target group and intervention effect. These, and other features of the relationships between the heritability, intervention effect, target group size and disease prevalence can be investigated using our Shiny app ‘afheritability’ (Dahlqwist et al. 2018).
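The AF values quoted above can be approximated with a short numerical sketch. It rests on assumptions that should be read as such: the total liability variance is scaled to 1, G and E are independent normal variables with mean zero, and the intervention lowers the genetic risk by \(k\sigma _G\) for subjects with \(G>b\sigma _G\) and leaves all other subjects unchanged. The function name and the trapezoid integration are illustrative choices and not a reproduction of the Shiny app.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def af_targeted_shift(h2, p, k, b, steps=4000, z_max=10.0):
    """Approximate AF when G is lowered by k*sigma_G for subjects with G > b*sigma_G.

    Assumes 0 < h2 < 1, 0 < p < 1 and unit total liability variance.
    """
    beta = STD.inv_cdf(1.0 - p)      # threshold implied by the prevalence p
    sigma_g = h2 ** 0.5
    sigma_e = (1.0 - h2) ** 0.5
    lower = max(b, -z_max)           # b = -inf targets the whole population

    def prevented(z):
        # z is the standardized genetic risk G / sigma_G
        risk_before = 1.0 - STD.cdf((beta - sigma_g * z) / sigma_e)
        risk_after = 1.0 - STD.cdf((beta - sigma_g * (z - k)) / sigma_e)
        return STD.pdf(z) * (risk_before - risk_after)

    # Trapezoid rule for the prevented cases, integrated over the target group
    width = (z_max - lower) / steps
    total = 0.5 * (prevented(lower) + prevented(z_max))
    for i in range(1, steps):
        total += prevented(lower + i * width)
    return total * width / p


# Heritability and prevalence as reported for AMI and stroke, with k = 1
for name, h2, p in (("AMI", 0.36, 0.06), ("stroke", 0.17, 0.04)):
    for b in (2.33, 1.64):
        print(name, b, round(af_targeted_shift(h2, p, k=1.0, b=b), 3))
```

Under these assumptions the sketch should give values close to those reported above. Small discrepancies are to be expected from the rounding of b and from the numerical integration.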
Example 2: prevention of breast cancer
Studies have shown that BPM almost eliminates the risk of breast cancer (Rebbeck et al. 2004). We thus assume that the intervention effect of the BPM intervention almost eliminates breast cancer within the target group, e.g. \(k=10\). Moreover, preventive treatment with tamoxifen has been shown to almost halve the cumulative rate of invasive breast cancer (Fisher et al. 2005). Thus, based on these studies we assume that preventive tamoxifen treatment can reduce the lifetime prevalence within the target group by around 50%, which approximately corresponds to \(k=1\).
The target groups for these interventions can be chosen based on a genetic risk score for breast cancer (Shieh et al. 2016). We assume that BPM will only be given to women at the highest 1% genetic risk of breast cancer, i.e. \(b=2.33\). Tamoxifen treatment is not as invasive as the BPM intervention. However, due to the adverse effects of tamoxifen (Fisher et al. 1994; van Leeuwen et al. 1994), it is only recommended to women with a high genetic risk of breast cancer (Moyer and US Preventive Services Task Force 2013). Therefore, we assume that this intervention can be given to, at most, women at the highest 5% genetic risk of breast cancer, i.e. \(b=1.64\). We compare the AF for tamoxifen given to those at the 1% versus 5% highest genetic risk.
Figure 4 illustrates the AF as a function of the intervention effect k, for heritability 31% and target group sizes equal to 1% and 5%. We observe that the AF is 6.7% for the BPM intervention given to the 1% at the highest genetic risk, 3.1% for the tamoxifen intervention given to the 1% at the highest genetic risk and 13% for the tamoxifen intervention given to the 5% at the highest genetic risk. Thus, even though BPM almost eliminates breast cancer within the target group, it has a smaller impact than the less efficient tamoxifen intervention given to 5%, but a larger impact than the tamoxifen intervention given to 1%. This example illustrates that a large effect of a prevention strategy may not have a large population impact if the intervention is limited to a small part of those at risk.
Heritability is a central concept in genetic epidemiology. Yet, because it is defined in terms of proportion of variance of latent disease liability, it is difficult to interpret the population implications of a particular value of the heritability (Witte et al. 2014). Attempts to do so sometimes interpret heritability as an attributable fraction (AF) (Mucci et al. 2016). In this article, we have shown how the relationship between the heritability and the AF can be formalized and how these results can be used to understand the population implications of a particular value of the heritability.
The relationship between the heritability and the AF is rather intricate, since it depends on several parameters in a non-linear way. Both the disease prevalence and the effect of the genetic intervention modify the impact of the heritability on the AF. In reality, interventions may not always be targeted to the whole population. We have accounted for this by adding the possibility to consider situations where the intervention is only targeted at those at the highest genetic risk. Intuitively, one would expect that the AF increases monotonically with the heritability. However, we have shown by examples that this is not necessarily the case. In particular, we have shown that, if the prevalence is high and the target group is small, then the AF may increase with the heritability up to a certain point, after which it will start to decrease.
Two examples have been used to illustrate how our results can be used to understand the relationship between the heritability and the AF for different scenarios. The first example is an intervention with blood pressure and cholesterol-lowering medication to prevent AMI and stroke. In this example we have considered the target group, b, and the intervention effect, k, to be fixed and we have compared the AF separately for AMI and stroke. For the same intervention, the AF is larger for AMI compared to stroke due to the larger heritability of AMI. In the second example, we have compared two interventions to prevent breast cancer, bilateral prophylactic mastectomy and tamoxifen with a fixed heritability, \(h^2\), and prevalence, p. Bilateral prophylactic mastectomy (BPM) has a large intervention effect, but can only be given to a limited target group. Compared to BPM, tamoxifen has a smaller intervention effect but can be given to a larger target group. In this example, we observed how the AF is larger for the tamoxifen intervention compared to the BPM intervention despite its lower intervention effect since the target group is larger.
We are not the first to use the AF to measure the effect of genes. However, virtually all applications that we are aware of define the exposure as a single SNP/gene (Claus et al. 1996; Witte et al. 2014; Khoury et al. 2005) or a limited set of SNPs (Wang et al. 2018), thus measuring the effect of that single SNP and not the aggregated impact of the whole genome. A notable exception is Ramakrishnan and Thacker (2012), who used the AF for twin data and defined the exposure for a given twin as the disease status in the co-twin. The authors claimed that, by using this exposure definition, their attributable fraction would measure the proportion of disease cases that are ‘due to heritability’. However, they did not formally motivate this claim, and it is not obvious whether the term ‘due to heritability’ is empirically meaningful.
To derive the relation between the heritability and the AF, we have used the liability threshold model, which relies on several important assumptions. Specifically, it assumes that the genetic and environmental risks for the disease are normally distributed, the effect of genes and environment add up to the liability (i.e. that there is no additive statistical interaction between the risk factors) and the genetic and environmental risks are independent. Regarding the first assumption, it is reasonable to assume that the genetic and environmental risks are normally distributed, since we are considering complex traits that depend on the accumulated small contributions from many different genetic and environmental factors. The second assumption of mainly additive effects from the genetic and environmental factors has been debated to a great extent (Hill et al. 2008). However, there is not much evidence of statistical interaction effects between genes and environment (Hill et al. 2008; Hunter 2005). The third assumption of no gene–environment correlation is violated if genes affect the environment or vice versa. Gene–environment correlation might occur due to various reasons (Jaffee and Price 2008) and should be carefully considered in each particular application.
In this article, we have conceptualized ‘genetic interventions’ as all interventions that modify the genetic risk. These can be pure genetic interventions, such as gene therapy, or interventions that target the mechanisms by which the genes exert their effect, such as bilateral prophylactic mastectomy. Thus, by ‘genetic’ we do not necessarily mean that the genes are modified per se, but rather that the intervention modifies the circumstances that allow the genetic variants to manifest.
Throughout, we have defined the target group as those at the highest genetic risk. As suggested by one of the reviewers, target groups may in practice also be defined in terms of environmental risk for disease. In Appendix C we derive the relation between the AF and the heritability, when the target groups are defined as those subjects for which the total (genetic and environmental) liability exceeds a certain threshold. We show that, in fact, this alternative definition of the target group leads to simpler calculations, which only involve univariate normal distribution functions.
We would like to acknowledge the biostatistics group at MEB, Zoltán Kutalik, Ralf Kuja-Halkola and Patrick Sullivan for valuable discussions and comments on this work. We acknowledge the financial support from the Swedish Research Council through the Swedish Initiative for Research on Microdata in the Social and Medical Sciences (SIMSAM) framework grant no. 340-2013-5867.
Compliance with ethical standards
Conflict of interest
The authors declare that there is no conflict of interest.
- Dahlqwist E, Magnusson P, Pawitan Y, Sjölander A (2018) Afheritability: a tool to visualize the relationship between the attributable fraction and the heritability. https://afheritability.shinyapps.io/afheritability/. Accessed 19 Jan 2018
- Falconer DS, Mackay TFC (1996) Introduction to quantitative genetics, 4th edn. Longman, London
- Fisher B, Costantino JP, Redmond CK, Fisher ER, Wickerham DL, Cronin WM (1994) Endometrial cancer in Tamoxifen-treated breast cancer patients: findings from the National Surgical Adjuvant Breast and Bowel Project (NSABP) B-14. JNCI: J Nat Cancer Inst 86(7):527–537
- Fisher B, Costantino JP, Wickerham DL, Cecchini RS, Cronin WM, Robidoux A, Bevers TB, Kavanah MT, Atkins JN, Margolese RG, Runowicz CD, James JM, Ford LG, Wolmark N (2005) Tamoxifen for the prevention of breast cancer: current status of the national surgical adjuvant breast and bowel project P-1 study. JNCI: J Nat Cancer Inst 97(22):1652–1662
- Hill WG, Goddard ME, Visscher PM (2008) Data and theory point to mainly additive genetic variance for complex traits. PLOS Genet 4(2):e1000008
- Khera AV, Emdin CA, Drake I, Natarajan P, Bick AG, Cook NR, Chasman DI, Baber U, Mehran R, Rader DJ, Fuster V, Boerwinkle E, Melander O, Orho-Melander M, Ridker PM, Kathiresan S (2016) Genetic risk, adherence to a healthy lifestyle, and coronary disease. N Engl J Med 375(24):2349–2358
- van Leeuwen FE, van den Belt-Dusebout AW, van Leeuwen FE, Benraadt J, Diepenhorst FW, van Tinteren H, Coebergh JWW, Kiemeney LALM, Gimbrère CHF, Otter R, Schouten LJ, Damhuis RAM, Benraadt J, Bontenbal M (1994) Risk of endometrial cancer after tamoxifen treatment of breast cancer. The Lancet 343(8895):448–452
- Levin ML (1953) The occurrence of lung cancer in man. Acta-Unio Internationalis Contra Cancrum 9(3):531–541
- Möller S, Mucci LA, Harris JR, Scheike T, Holst K, Halekoh U, Adami HO, Czene K, Christensen K, Holm NV, Pukkala E, Skytthe A, Kaprio J, Hjelmborg JB (2016) The heritability of breast cancer among women in the Nordic twin study of cancer. Cancer Epidemiol Biomark Prev Publ Am Assoc Cancer Res Cosponsored Am Soc Prev Oncol 25(1):145–150
- Moyer VA, US Preventive Services Task Force (2013) Medications to decrease the risk for breast cancer in women: recommendations from the US Preventive Services Task Force recommendation statement. Ann Intern Med 159(10):698–708
- Moyer VA, US Preventive Services Task Force (2014) Risk assessment, genetic counseling, and genetic testing for BRCA-related cancer in women: US Preventive Services Task Force recommendation statement. Ann Intern Med 160(4):271–281
- Mozaffarian D, Benjamin EJ, Go AS, Arnett DK, Blaha MJ, Cushman M, de Ferranti S, Després JP, Fullerton HJ, Howard VJ, Huffman MD, Judd SE, Kissela BM, Lackland DT, Lichtman JH, Lisabeth LD, Liu S, Mackey RH, Matchar DB, McGuire DK, Mohler ER, Moy CS, Muntner P, Mussolino ME, Nasir K, Neumar RW, Nichol G, Palaniappan L, Pandey DK, Reeves MJ, Rodriguez CJ, Sorlie PD, Stein J, Towfighi A, Turan TN, Virani SS, Willey JZ, Woo D, Yeh RW, Turner MB, American Heart Association Statistics Committee and Stroke Statistics Subcommittee (2015) Heart disease and stroke statistics-2015 update: a report from the American Heart Association. Circulation 131(4):e29–322
- Mucci LA, Hjelmborg JB, Harris JR, Czene K, Havelick DJ, Scheike T, Graff RE, Holst K, Möller S, Unger RH, McIntosh C, Nuttall E, Brandt I, Penney KL, Hartman M, Kraft P, Parmigiani G, Christensen K, Koskenvuo M, Holm NV, Heikkilä K, Pukkala E, Skytthe A, Adami HO, Kaprio J, Nordic Twin Study of Cancer (NorTwinCan) Collaboration (2016) Familial risk and heritability of cancer among twins in Nordic countries. JAMA 315(1):68–76
- Nichols HB, DeRoo LA, Scharf DR, Sandler DP (2015) Risk-benefit profiles of women using tamoxifen for chemoprevention. JNCI: J Nat Cancer Inst 107(1). https://doi.org/10.1093/jnci/dju354
- Ramakrishnan V, Thacker LR (2012) Population attributable fraction as a measure of heritability in dichotomous twin data. Commun Stat: Simul Comput 41(3):405–418
- Rebbeck TR, Friebel T, Lynch HT, Neuhausen SL, van ’t Veer L, Garber JE, Evans GR, Narod SA, Isaacs C, Matloff E, Daly MB, Olopade OI, Weber BL (2004) Bilateral prophylactic mastectomy reduces breast cancer risk in BRCA1 and BRCA2 mutation carriers: the PROSE study group. J Clin Oncol Off J Am Soc Clin Oncol 22(6):1055–1062
- van Rijn MJE, Schut AFC, Aulchenko YS, Deinum J, Sayed-Tabatabaei FA, Yazdanpanah M, Isaacs A, Axenovich TI, Zorkoltseva IV, Zillikens MC, Pols HAP, Witteman JCM, Oostra BA, van Duijn CM (2007) Heritability of blood pressure traits and the genetic contribution to blood pressure variance explained by four blood-pressure-related genes. J Hypertens 25(3):565–570
- Thanassoulis G, Peloso GM, Pencina MJ, Hoffmann U, Fox CS, Cupples LA, Levy D, D’Agostino RB, Hwang SJ, O’Donnell CJ (2012) A genetic risk score is associated with incident cardiovascular disease and coronary artery calcium: the Framingham heart study. Circ: Cardiovasc Genet p CIRCGENETICS.111.961342
- Yusuf S, Lonn E, Pais P, Bosch J, López-Jaramillo P, Zhu J, Xavier D, Avezum A, Leiter LA, Piegas LS, Parkhomenko A, Keltai M, Keltai K, Sliwa K, Chazova I, Peters RJG, Held C, Yusoff K, Lewis BS, Jansky P, Khunti K, Toff WD, Reid CM, Varigos J, Accini JL, McKelvie R, Pogue J, Jung H, Liu L, Diaz R, Dans A, Dagenais G, Investigators H (2016) Blood-pressure and cholesterol lowering in persons without cardiovascular disease. N Engl J Med 374(21):2032–2043
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.