Introduction

Smoking is a fast and efficient form of drug use, exposing smokers to multiple harmful components in tobacco. These cause the well-known consequences of smoking, which are extensive and cause disease in almost every organ and part of the body [1]. The largest public health burden arises from cancers (for example lung cancer), cardiovascular disease, chronic obstructive pulmonary disease (COPD), and a variety of mental disorders. Nicotine is the main addictive component in tobacco. Cigarette smokers control their nicotine levels via cigarette consumption, number and volume of puffs, and depth of inhalation [2]. During smoking, nicotine is distilled from burning tobacco, and when inhaled, is carried on tar droplets to the lungs. In the small airways and alveoli of the lung nicotine is rapidly absorbed, and is then distributed via the bloodstream, reaching the brain in 10–20 s [2, 3]. Nicotine binds to brain tissue with high affinity, especially in regular smokers whose binding capacity is increased as a result of functional up-regulation of nicotinic acetylcholine receptors (nAChRs) [4]. The rapid delivery of nicotine enables the smoker to titrate the dose to achieve the desired pharmacological effect, which further reinforces drug self-administration and facilitates the development of addiction [2].

Intake of nicotine, its central nervous system effects, and its metabolism are regulated by biological pathways; some of these are well known, but others are not. Genetic studies offer insight into the genes contributing to those pathways by examining genetic variation with a range of study designs. Family and twin studies have revealed a high degree of heritability of smoking and nicotine dependence [5, 6]. In recent years, large genome-wide association study (GWAS) meta-analyses have revealed that the strongest genetic contribution to smoking-related traits comes from variation in the nAChR subunit genes [7••, 8••, 9••], but many other genes have been implicated by candidate gene studies, small-scale GWAS, and extensive animal studies. This review covers progress made in recent years in several specific areas of the genetics and genetic epidemiology of smoking behavior and nicotine dependence.

Genetic Findings Regarding Nicotinic Acetylcholine Receptors

In the brain, nicotine binds to nAChRs, which are ligand-gated cation channels that normally bind endogenous acetylcholine [10]. The binding of nicotine at the receptor binding-site induces the release of a variety of neurotransmitter molecules, for example dopamine, serotonin, glutamate, and gamma aminobutyric acid (GABA) [11, 12]. Dopamine controls the reward pathway, and is thus a major contributor to the development of addiction [13]. Nicotine also increases the release of glutamate [14], which is believed to be involved in learning and memory by enhancing synaptic plasticity [11, 15]. Consequently, the pleasurable experience of smoking is created by learning and memory, reinforcing the addictive effects of nicotine. As described above, the inhalation of smoke particles from tobacco initiates a complex set of modifications in signaling cascades that probably have a strong pharmacological contribution to nicotine addiction.

Nicotinic receptors are broadly classified as muscle-type and neuronal-type on the basis of their primary expression sites. The mammalian nervous system is known to express twelve neuronal subunits, nine alpha (α2–α10) and three beta subunits (β2–β4), and five muscular subunits (α1, β1, γ, δ, and ε). Nicotinic receptors are pentameric structures [16], and different combinations of subunits result in different receptor subtypes that vary in pharmacological properties, for example binding affinity, and distribution in the nervous system. The most widely expressed nAChR subtype in the human brain is composed of alpha4 and beta2 subunits, and it has a central function in the mediation of physiological effects of nicotine [17].

In 2010, large GWAS meta-analyses convincingly confirmed that the strongest genetic contribution to smoking-related traits comes from variation in the nAChR subunit genes [7••, 8••, 9••], as first revealed on a genome-wide significant level by Thorgeirsson et al. [18] in a study of over 13000 smokers from Iceland. The CHRNA5-CHRNA3-CHRNB4 gene cluster on chromosome 15q25.1, encoding the alpha5, alpha3, and beta4 subunits, has provided the most prominent genetic evidence; first regarding amount smoked (cigarettes per day, CPD), and subsequently regarding other smoking-related phenotypes. Most GWAS meta-analyses have been conducted on populations of European ancestry. The signal from the 15q25.1 region has also been identified in African Americans [19], but not in the Asian population [20].

The 15q25.1 locus contains a dense set of highly correlated single nucleotide polymorphisms (SNPs), and further examination of the region has revealed at least two distinct loci that contribute to heaviness of smoking [21•]. The most well-established locus within the 15q25.1 region is tagged by the functional SNP rs16969968 that causes an amino acid change (D398N) in the alpha5 subunit [22], and has been revealed to contribute to increased nicotine consumption by reducing the ability of (α4β2)α5 nAChRs to induce a normal inhibitory motivational signal intended to limit nicotine intake [23•]. The variant has a remarkably similar effect in all studied samples, even though the minor allele frequencies vary between populations [24, 25]. In a recent replication study of a large and homogenous Finnish population sample, the estimated effect size for rs16969968 was 1.39 (odds ratio) for heavy smoking (CPD > 20) vs. light smoking (CPD ≤ 10) [26]. The effect size of this SNP for continuous CPD was approximately one CPD for each minor allele, in agreement with the original GWAS report by Thorgeirsson et al. [18].
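
The two effect sizes quoted above can be turned into simple arithmetic. The following is a minimal sketch, assuming a purely additive model for CPD (approximately one cigarette per day for each minor allele, as stated above) and, as a labeled assumption, a log-additive reading of the odds ratio of 1.39. The text does not say whether that odds ratio is a per-allele estimate, and the function names are hypothetical.

```python
# Illustrative only: additive and log-additive readings of the rs16969968
# effect sizes quoted in the text. Function names are hypothetical.

CPD_PER_MINOR_ALLELE = 1.0   # approx. one CPD per minor allele (from the text)
ODDS_RATIO = 1.39            # heavy vs. light smoking (from the text)


def _check_allele_count(minor_alleles: int) -> None:
    """A diploid genotype carries 0, 1, or 2 copies of the minor allele."""
    if minor_alleles not in (0, 1, 2):
        raise ValueError("minor_alleles must be 0, 1, or 2")


def expected_cpd_shift(minor_alleles: int) -> float:
    """Expected difference in CPD relative to non-carriers (additive model)."""
    _check_allele_count(minor_alleles)
    return CPD_PER_MINOR_ALLELE * minor_alleles


def odds_ratio_for_genotype(minor_alleles: int) -> float:
    """Odds ratio for heavy vs. light smoking relative to non-carriers.

    ASSUMPTION: the reported odds ratio is a per-allele estimate that acts
    multiplicatively (log-additive model); the text does not confirm this.
    """
    _check_allele_count(minor_alleles)
    return ODDS_RATIO ** minor_alleles
```

Under these assumptions a homozygous carrier would be expected to smoke about two more cigarettes per day than a non-carrier; the corresponding odds ratio depends entirely on the log-additive assumption and should be read with that caveat.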

Several independent replications regarding a variety of smoking-related traits and diseases, including nicotine dependence, smoking amount, age of initiation, lung cancer, and COPD, have been reported on either rs16969968 or other highly correlated polymorphisms (e.g. rs1051730 located on CHRNA3) [27–33]. In addition, an age-associated relationship underlying the detected associations has been suggested [34•], with a stronger association in early-onset smokers.

Further investigation of this robust smoking-quantity-associated region has revealed novel findings relating to other phenotypes. For instance, association was detected between alcohol use and rs588765, an SNP believed to tag a third distinct locus contributing to smoking behavior [21•], in a large Finnish population-based sample [26]. The results provided a new direction for research on the CHRNA5-CHRNA3-CHRNB4 gene cluster, and suggested that the effects of alcohol may be partially mediated via cholinergic receptors. Earlier, there were reports linking this variant to alcohol and cocaine dependence [35]. Another association was detected, in a large Norwegian sample, between the functional SNP rs16969968 and use of snus [28]. Snus is a moist, smokeless, nicotine-containing tobacco product, which delivers high quantities of nicotine, and has a very similar addiction potential to cigarettes [36].

Variation in another nAChR gene cluster on chromosome 8p11.21, which contains genes encoding the alpha6 and beta3 subunits (CHRNA6, CHRNB3), has genome-wide significant association with CPD [9••]. However, the strength of the association is modest compared with those obtained for the CHRNA5-CHRNA3-CHRNB4 gene cluster. Independent studies have provided further evidence of association between CHRNB3 and nicotine dependence, measured by the Fagerström test for nicotine dependence (FTND) [37], in multiple ethnic populations [38, 39].

Genome-wide association studies have provided a powerful tool for studying common variants in large samples. Despite the potential function of each nAChR gene in smoking behavior, GWAS have identified only a few subunits (α5, α3, β4, α6, and β3). A meta-analysis of 15 genome-wide linkage scans yielded a genome-wide significant linkage signal at 20q13.12-q13.32, a locus that contains CHRNA4, the gene encoding nAChR subunit alpha4 [40•]. Re-sequencing of CHRNA4 and CHRNB4 has disclosed rare variants affecting inter-individual differences of nicotine dependence [41, 42]. For both genes, rare non-synonymous variants with protective effects against nicotine dependence were detected. Next-generation sequencing will probably enable discovery of more rare variants, explaining differences in smoking behavior and predisposition to smoking-related diseases.

Before the era of GWAS, candidate gene studies implied associations between smoking behavior and several genes, e.g. within the nicotinergic and dopaminergic pathways [43]. However, most of the findings lack consistent replication, which suggests either high type-I error (false positive signals) in these studies, inadequate power to detect genetic loci with small effect sizes, or population specificity of the detected associations.

Neuregulin Signaling Pathway

Many of the gene systems implicated in smoking behavior have pleiotropic effects across a variety of substance dependencies and other established co-morbidities of smoking, for example depression and schizophrenia, suggesting shared underlying pathophysiology. For example, variants in nAChR genes have been associated not only with smoking quantity and nicotine dependence, but also with alcohol and cocaine dependence [35]. Detected associations between schizophrenia and variants in CHRNA3 [44], CHRNA5 [45], and CHRNA7 [46] suggest a function for nAChRs in schizophrenia predisposition. In a recent GWAS combining five psychiatric disorders (schizophrenia, bipolar disorder, major depressive disorder, autism spectrum disorders, and attention deficit hyperactivity disorder), two calcium channel subunits, CACNA1C and CACNB2, were identified [47]. It is plausible that genetic variation in basic systems may increase general susceptibility to neuropsychiatric disorders, and some combination of other genetic and non-genetic risk factors channels this risk into the development of specific disorders.

The co-morbidity between schizophrenia and nicotine dependence is well established, yet the underlying shared etiology is largely unknown. Most patients with schizophrenia smoke, and up to 75 % of them are nicotine dependent [48]. Recently, the neuregulin signaling pathway (NSP) was suggested as a novel component of the shared genetic underpinnings of schizophrenia and nicotine dependence [49•, 50•]. Neuregulins are a family of signaling molecules that bind to receptor tyrosine kinases of the ErbB family to modulate neuronal migration and differentiation [51]. The ErbB4 receptor has an important function in regulating neurite outgrowth, axonal guidance, and synaptic signaling and plasticity [52]. The NSP consists of gene products encoded by at least 10 distinct genes. Three of those genes, Neuregulin 1 (NRG1), Neuregulin 3 (NRG3), and V-Erb-A Erythroblastic Leukemia Viral Oncogene Homolog 4 (Avian) (ERBB4), have been associated with schizophrenia predisposition and symptomatology [53–59]. Evidence from mouse models supports a function for at least two additional members, BACE1 and APH1B, in schizophrenia [60, 61].

A recent GWAS on Finnish twins revealed convergent evidence for an association between DSM-IV (Diagnostic and statistical manual of mental disorders, 4th edition) [62] nicotine dependence and ERBB4, suggesting involvement of the NSP in nicotine dependence [49•]. This finding was supported by behavioral mouse models revealing abolishment of withdrawal-induced anxiety both in mice with a knock-down mutation in Nrg3 and in mice treated with an ErbB4 inhibitor, suggesting that the NSP is essential to the anxiety effects of nicotine withdrawal [50•]. NRG3 variants were revealed to be associated with smoking cessation success in a clinical trial [50•]. Because it has been suggested that multiple variants aggregating in the NSP may be necessary to predispose patients to schizophrenia [63], scrutiny of other members within the NSP may reveal a further shared genetic predisposition for schizophrenia and nicotine dependence.

Nicotine Metabolism

Nicotine metabolism involves multiple steps and several enzymatic pathways [2]. Only approximately 10 % of absorbed nicotine is excreted unchanged in urine. Up to 80 % of nicotine is converted to cotinine in a two-step process: the first step is mediated by the cytochrome P450 system, mainly by CYP2A6 (cytochrome P450, family 2, subfamily A, polypeptide 6); and the second step is catalyzed by a cytoplasmic aldehyde oxidase [2]. Cotinine is further metabolized into a variety of compounds; by far the most prominent route is its conversion to 3-hydroxycotinine, performed exclusively by CYP2A6. The remaining 10 % of nicotine is metabolized through oxidation, glucuronidation, or methylation before excretion [2].

Rates of nicotine metabolism vary substantially between individuals. CYP2A6 encodes the main metabolic enzyme for nicotine [64], accounting for approximately 80 % of hepatic nicotine oxidation. To date, over 60 distinct CYP2A6 alleles have been identified (http://www.cypalleles.ki.se/cyp2a6.htm), including SNPs, duplications, deletions, and conversions. CYP2A6 alleles have been phenotypically grouped as slow, intermediate, and normal metabolizers, with significant differences in allele frequencies among ethnic groups [65]. Individuals who carry null or reduced-activity CYP2A6 alleles are more likely to be non-smokers, smoke fewer cigarettes per day, are less likely to progress to nicotine dependence, are less dependent on nicotine, may have an easier time quitting smoking, and have a lower risk of lung cancer [66, 67].

Twin studies suggest an important genetic contribution to total clearance of nicotine (nicotine → cotinine → 3-hydroxycotinine), with heritability estimates of 50–68 % [68]. Cotinine is a relatively stable compound, with a half-life of 15–20 h [69], and can be used to distinguish smokers from non-smokers and as a biomarker of nicotine intake. In individuals with slower CYP2A6 activity, however, cotinine accumulates [70]; therefore, cotinine levels may not reliably reflect tobacco exposure. The ratio of 3-hydroxycotinine/cotinine, referred to as the nicotine metabolite ratio (NMR), reflects both genetic variation in nicotine-metabolizing enzymes and environmental factors (e.g. estrogen levels and body mass index) [69], and can be used as a proxy for the rate of nicotine clearance [71]. The NMR remains fairly constant over time in regular smokers, and is not dependent on the time of last nicotine dosing.
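
The two quantities described above, the nicotine metabolite ratio and the slow decay of cotinine, reduce to short calculations. The following is a minimal sketch with hypothetical function names; it assumes that both metabolites are measured in the same units and that cotinine elimination follows simple first-order (exponential) kinetics, which is a simplification.

```python
# Illustrative only: NMR and first-order cotinine decay.
# Function names are hypothetical; units must match for the ratio.


def nicotine_metabolite_ratio(hydroxycotinine: float, cotinine: float) -> float:
    """NMR = 3-hydroxycotinine / cotinine, both in the same concentration units."""
    if hydroxycotinine < 0:
        raise ValueError("3-hydroxycotinine concentration must be non-negative")
    if cotinine <= 0:
        raise ValueError("cotinine concentration must be positive")
    return hydroxycotinine / cotinine


def fraction_remaining(hours_elapsed: float, half_life_hours: float) -> float:
    """Fraction of cotinine remaining after a given time.

    ASSUMPTION: single-compartment, first-order elimination.
    The text gives a cotinine half-life of 15-20 h.
    """
    if hours_elapsed < 0 or half_life_hours <= 0:
        raise ValueError("time must be non-negative and half-life positive")
    return 0.5 ** (hours_elapsed / half_life_hours)
```

For example, with a half-life at the upper end of the quoted range (20 h), a quarter of the cotinine would remain after 40 h under this simplified model, which illustrates why cotinine is a more stable marker of intake than nicotine itself.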

CYP2B6 (cytochrome P450, family 2, subfamily B, polypeptide 6) is the second most active P450 enzyme involved in nicotine oxidation, and has approximately 10 % of the catalytic efficiency of CYP2A6. Whereas CYP2A6 is expressed primarily in the liver, CYP2B6 is expressed at higher levels in the brain, possibly accounting for localized metabolism of nicotine in the brains of human smokers [72]. At least two additional P450 enzymes, CYP2D6 and CYP2E1, have some activity toward nicotine [2].

Cytochrome P450 drug-metabolizing enzymes are rarely detected in GWAS, because the allele frequencies of the functional variants are low in most populations [2]. However, a recent very large GWAS meta-analysis revealed associations between smoking quantity and variants in CYP2A6 and CYP2B6 [9••]. The associating variant in CYP2A6 is in linkage disequilibrium with the reduced-activity allele [9••]. Nevertheless, genetic complexities should be considered when disentangling the GWAS results. Recently, a variant in EGLN2 (Homo sapiens egl nine homolog 2 (C. elegans)), a gene located adjacent to CYP2A6 on chromosome 19q13, was revealed to independently associate with CPD and breath carbon monoxide, a phenotype associated with cigarette consumption and relevant to hypoxia [73]. It is plausible that genes within the 19q13 locus other than CYP2A6 also affect smoking behavior, via mechanisms unrelated to nicotine metabolism. Combined effects of CYP2A6 and CHRNA5 on smoking behavior have been reported [74, 75]. Within most Caucasian populations, in which up to 90 % of individuals are fast metabolizers, only a small fraction of the variance in inter-individual differences in nicotine metabolism is accounted for by known allelic variants affecting CYP2A6 activity. Other contributing factors, for example regulators of CYP2A6 action, functional variants in other genes, gene–gene interactions, and gene–environment interactions, are bound to exist.

Major challenges in the genetic studies of smoking behavior include phenotype heterogeneity and lack of consistent outcome measurements. It is probable that the phenotypic definitions used to date, for example self-reported cigarettes per day, do not accurately reflect nicotine intake, as revealed by the finding that the CHRNA5 variant accounts for five times more variance in cotinine levels than in smoking quantity [76, 77]. In an interim GWAS meta-analysis of cotinine levels, a sample of approximately 2000 individuals revealed association with the CHRNA5-A3-B4 gene cluster [78], with P-values exceeding genome-wide significance, and effect sizes comparable to those obtained for smoking quantity by a large (N = 74053) GWAS meta-analysis [7••]. Consideration of phenotype quality and precision is proving more beneficial than recruiting increasing numbers of subjects with crude phenotypes [77], revealing the utility of measurable biomarkers in the genetic analysis of smoking behavior.

Gene–Environment Interactions

It is important to recognize that genes alone do not determine phenotypes: environmental factors can significantly regulate the expression of an individual’s genetic predisposition [6]. Twin and family studies have also been used to define the role of non-genetic factors, which is possible because confounding genetic factors can be controlled in genetically informative data sets. The standard models used to estimate heritability from twin and family studies and to evaluate findings from GWAS assume that the effects of genes and environments are independent of each other. However, heritability may be overestimated if gene–environment (G × E) interactions are present. Despite expectations of substantial G × E correlations and interactions, caused by influences on smoking initiation from environmental factors shared within families and from extra-familial environmental factors shared with peers and birth cohorts, such effects have not been widely studied. In a G × E interaction study conducted on the Finnish FinnTwin12 cohort, genetic effects on adolescent smoking decreased and common environmental influences increased at higher levels of parental monitoring [79]. Similarly, religiousness has been reported to significantly attenuate the effect of genetic variance on smoking initiation [80]. Gene–environment interaction analyses of CHRNA5 have revealed that both parental monitoring and peer influence (smoking) modify the association between nicotine dependence and rs16969968 [81, 82]. Furthermore, the rs16969968 risk allele contributes a stronger genetic risk of heavy smoking in early-onset smokers (age at onset ≤16 years) than in late-onset smokers [34•]. More studies of G × E interactions may help determine why heritability estimates for smoking behaviors vary so greatly and why the search for predisposing genes has not been particularly successful to date.

Some G × E interactions probably reflect epigenetic mechanisms, for example DNA methylation. Epigenetic processes react to external factors, and therefore provide a crucial mechanism by which environment can affect gene expression and hence phenotype. This is a topic of active research, with major breakthroughs expected in the near future; however, findings in epigenetics are outside the scope of this review.
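
The standard twin model referred to earlier in this section can be illustrated with Falconer's classical approximation, which partitions trait variance from the correlations observed in monozygotic (MZ) and dizygotic (DZ) twin pairs. This is a textbook simplification offered as a sketch, not the structural equation models fitted in the cited studies; like those models, it assumes that genetic and environmental effects are independent and additive, so it gives no protection against the G × E problem described above. The function name and the example correlations are hypothetical.

```python
# Illustrative only: Falconer's approximation for a classical twin design.
# Assumes additive genetic effects, equal environments for MZ and DZ pairs,
# and no gene-environment interaction or correlation.


def falconer_estimates(r_mz: float, r_dz: float) -> dict:
    """Partition trait variance from MZ and DZ twin correlations.

    h2: additive genetic variance (heritability)
    c2: common (shared) environmental variance
    e2: unique environmental variance, including measurement error
    """
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie between -1 and 1")
    h2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    e2 = 1.0 - r_mz
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical correlations, chosen only to show the arithmetic.
example = falconer_estimates(r_mz=0.8, r_dz=0.5)
```

The three components always sum to one by construction, which is why any variance that in reality arises from G × E interaction has to be absorbed into one of them.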

Conclusions

The progress made in recent years in understanding the genetic epidemiology of non-communicable diseases, and of many normal human and animal traits, has deepened our understanding of the underlying intricate genetic architecture, while providing insights into the interplay of genes and environment in normal and abnormal development of organisms. Deaths from tobacco-induced diseases are projected to reach hundreds of millions this century unless prevention and treatment can be made much more effective. Using genetics to improve our knowledge of the neurobiology and neuropathology of nicotine and other tobacco components is essential in building the knowledge base necessary for action.

Mendelian randomization (MR) can be used to scrutinize the multiple reported associations of smoking with a variety of diseases [83]. Although many are causal, resulting from the extreme toxicity of cigarettes and other tobacco products, some may result from confounding. The functional variant D398N (rs16969968) in CHRNA5 has a strong effect on smoking behavior and thus fulfils one of the necessary prerequisites for MR analyses. An ongoing large-scale MR meta-analysis conducted by the CARTA (Causal Analysis Research in Tobacco and Alcohol, PI professor Marcus Munafò, University of Bristol, UK) consortium targets rs16969968 to detect evidence for causal effects of smoking quantity on several independent outcomes, including smoking cessation, obesity and regional adiposity, income, vitamin D levels, lipids, blood pressure, and depression (http://www.bris.ac.uk/expsych/research/brain/targ/research/interests/).
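
The logic of MR with a single genetic instrument such as rs16969968 can be sketched with the Wald ratio: the variant's association with the outcome is divided by its association with the exposure. This is a generic textbook estimator shown under stated assumptions; the source does not describe which estimator the CARTA consortium uses. The function name and the example numbers are hypothetical, and the standard error is a first-order approximation that ignores uncertainty in the variant-exposure association.

```python
# Illustrative only: Wald ratio estimate for single-instrument
# Mendelian randomization. Names and example values are hypothetical.


def wald_ratio(beta_outcome: float, se_outcome: float, beta_exposure: float):
    """Return (causal_estimate, approximate_standard_error).

    beta_outcome:  per-allele association of the variant with the outcome
    se_outcome:    standard error of beta_outcome
    beta_exposure: per-allele association of the variant with the exposure
                   (e.g. about 1 CPD per minor allele for rs16969968)

    ASSUMPTIONS: the variant affects the outcome only through the exposure,
    is independent of confounders, and beta_exposure is estimated precisely.
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must be associated with the exposure")
    if se_outcome < 0:
        raise ValueError("standard error must be non-negative")
    estimate = beta_outcome / beta_exposure
    standard_error = se_outcome / abs(beta_exposure)
    return estimate, standard_error


# Hypothetical numbers: outcome changes by 0.02 units per allele, and each
# allele raises smoking quantity by about one cigarette per day.
estimate, standard_error = wald_ratio(0.02, 0.005, 1.0)
```

The estimate is expressed per unit of exposure (here, per additional cigarette per day), which is why a variant with a well-characterized effect on smoking quantity is a prerequisite for this kind of analysis.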

The search for more genes continues, and is empowered by increasing GWAS sample sizes, by extending analyses to exome and whole genome sequence data, and by improving phenotypes. The D398N variant accounts for only approximately 1 % of variation in CPD, but almost 5 % of variation in cotinine levels, as reviewed above [76, 77]. Thus, it should be easier to detect smoking-related genetic effects by using a biomarker of exposure (cotinine) rather than a crude measure of intake (CPD). On the basis of this assumption, a cotinine GWAS meta-analysis is in progress and results will be reported in 2014. At the moment new cohorts are not being accepted for inclusion, but for more information please contact Dr Jennifer Ware, University of Bristol. For dependence phenotypes, a GWAS consortium for FTND is coordinated by Dr Sam Chen, Virginia Commonwealth University; and for DSM-IV diagnoses, nicotine dependence is being analyzed within the framework of the Psychiatric Genomics Consortium (www.pgc.unc.edu). A genome-wide meta-analysis initiative targeting exomic variants has been established by the GWAS & Sequencing Consortium of Alcohol and Nicotine use (GSCAN), coordinated by Dr Scott Vrieze at the University of Michigan (http://gscan.sph.umich.edu/). Although not an exhaustive list of the collaborative efforts needed to make further progress, this brief summary is illustrative of the willingness of the nicotine-dependence genetics community to work together.
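
The argument made earlier in this section, that a variant explaining about 5 % of variation in cotinine is easier to detect than one explaining about 1 % of variation in CPD, can be made concrete with an approximate sample size calculation. The following is a rough sketch, assuming a single-variant linear regression, a normal approximation to the test statistic, the conventional genome-wide significance threshold of 5 × 10⁻⁸, and 80 % power; none of these settings comes from the source, and the function name is hypothetical.

```python
# Illustrative only: approximate sample size needed to detect a variant
# that explains a fraction r2 of trait variance. Normal approximation.
import math
from statistics import NormalDist


def approx_sample_size(r2: float, alpha: float = 5e-8, power: float = 0.80) -> int:
    """Approximate N such that a two-sided test at level alpha has the given power.

    Uses N ~ (z_{1 - alpha/2} + z_{power})^2 * (1 - r2) / r2.
    ASSUMPTIONS: alpha and power defaults are conventional choices,
    not values taken from the text.
    """
    if not 0.0 < r2 < 1.0:
        raise ValueError("r2 must lie strictly between 0 and 1")
    if not 0.0 < alpha < 1.0 or not 0.0 < power < 1.0:
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil((z_alpha + z_power) ** 2 * (1.0 - r2) / r2)


n_cpd = approx_sample_size(0.01)       # ~1 % of variation in CPD
n_cotinine = approx_sample_size(0.05)  # ~5 % of variation in cotinine
```

Under these assumptions the required sample for the CPD phenotype is roughly five times that for cotinine, which is in line with the observation that a cotinine analysis of about 2000 individuals could detect the CHRNA5-A3-B4 signal.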

Overall, next-generation sequencing will probably provide means to uncover more rare variants explaining differences in smoking-related behavior and predisposition to smoking-related diseases. Be they common or rare variants, identifying genetic loci linked to smoking behaviors is merely the first step in the discovery process. Many associations have not been properly evaluated for their functional relevance or for their public health effect. We can look forward to many years of exciting progress in this area.