Introduction

The active ingredient in pharmaceutical nicotine replacement therapy (NRT) products is nicotine. Millions of people are exposed daily to nicotine by using these smoking cessation products (Royal College of Physicians 2016; Siahpush et al. 2015). In this review, pharmaceutical NRTs are products regulated as drugs or medicines to aid smoking cessation (Royal College of Physicians 2016; US Food and Drug Administration 2013). Pharmaceutical NRT products have been available for over three decades, with current forms including chewing gum, lozenge, oral tablet, skin patch, nasal spray and inhaler (Ferguson et al. 2011; Royal College of Physicians 2016). Because daily exposure to these products depends on individual preferences and needs, a broad range of maximum plasma nicotine levels from 6 to 43 ng/ml has been reported (Haussmann and Fariss 2016; Schneider et al. 2001; Shiffman et al. 2005). In the USA, the recommended duration of use for NRT products is 8–12 weeks depending on the product form (US Food and Drug Administration 2013). However, the US Food and Drug Administration (FDA) recently proposed the possibility of a 6-month extension of NRT use limits with healthcare provider consultation (Fucito et al. 2014; US Food and Drug Administration 2013). It seems that uncertainty regarding the potential adverse health effects of long-term use of NRT products (such as cancer) may be, in part, responsible for the proposed increase in duration of use (from 3 to 6 months) being relatively modest (Shields 2011; US Surgeon General 2014). In other countries, including the UK, the medicines regulatory authorities appear to have a more relaxed approach to the long-term use of pharmaceutical NRT products. For example, since 2010, UK smokers have been encouraged to continue NRT use if needed to maintain smoking cessation. In other words, the consumer should decide how long to continue using NRT products (Kostygina et al. 2016; McNeill et al. 2001; Royal College of Physicians 2016).

Nicotine per se is a unique active ingredient for a consumer product in that the majority of nicotine’s effects are mediated by binding to and activating nicotinic acetylcholine (nACh) receptors in a wide variety of neuronal (central and peripheral nervous system) and non-neuronal tissues. Consequently, nicotine exposure affects numerous systems, including neurological, neuromuscular, cardiovascular, respiratory, immunological and gastrointestinal. The presence of different types of nACh receptors, receptor up-regulation and receptor desensitization influences these complex physiological effects (Lam et al. 2016; Marks et al. 1987; Renda and Nashmi 2014). Evidence from experimental animal models clearly demonstrates nicotine’s ability to enhance existing tissue injury and diseases such as cancer, cardiovascular disease, stroke, pancreatitis, peptic ulcer, renal injury and developmental (e.g. pulmonary, reproductive and central nervous system) abnormalities (Arany et al. 2011; Bruin et al. 2010; Chowdhury et al. 1995; Hall et al. 2016; Haussmann and Fariss 2016; Lau et al. 2006; Qiu et al. 1991; Rehan et al. 2012; Wang et al. 1997). These reported adverse health effects were observed following short-term exposure to nicotine per se (<12 weeks) and appear to be dependent on nicotine activation of nACh receptors in the affected tissue.

In regard to potential serious adverse health effects (SAHEs) associated with long-term use of pharmaceutical NRT products in humans, the United Kingdom’s National Institute for Health and Care Excellence (NICE) concluded that “evidence is available from studies with up to 5-year follow-up which suggests that ‘pure’ nicotine, in the form available in NRT products, does not pose a significant health risk” (National Institute for Health and Care Excellence 2013). In addition, a recent Surgeon General’s report concluded that inadequate evidence is available “to infer the presence or absence of a causal relationship between exposure to nicotine and risk of cancer” (US Surgeon General 2014). In contrast to these authoritative statements, FDA classifies NRT therapies as a Pregnancy Category “C” or “D” developmental hazard. Category D classification indicates “studies, adequate well-controlled or observational, in pregnant women have demonstrated a risk to the fetus” (Meadows 2001). Furthermore, potential SAHEs with pharmaceutical NRT product use are suggested based on government warning statements for such products. These include warnings against potential major adverse effects in the cardiovascular system (heart disease), gastrointestinal system (stomach ulcers) and diabetes with NRT product use (US Food and Drug Administration 2015a). Unfortunately, a systematic review of the scientific evidence related to potential SAHEs of pharmaceutical NRT products is not available.

The work described here has two main objectives: to identify and critically evaluate relevant studies pertaining to SAHEs in humans, if any, of pharmaceutical NRT and to provide strength of evidence evaluation and conclusions for these effects. For this review, we did not consider evidence relating to SAHEs from exposure to snus (or any other smokeless tobacco product) or electronic nicotine delivery systems (ENDS). Currently, these commercial products are regulated as tobacco products and are not developed, approved and licensed as drugs or medicines to aid smoking cessation (Royal College of Physicians 2016; US Food and Drug Administration 2016). The potential adverse health effects of snus have been well studied (Lee 2011, 2013), and ENDS are relatively new products with tremendous diversity in product design and aerosol delivery. Finally, the terms pharmaceutical NRT and NRT are used synonymously in this review.

Methods

Inclusion/exclusion criteria

Attention was restricted to epidemiological studies and clinical trials of NRT describing results relating to SAHEs. For this review, SAHEs are defined as adverse events leading to substantial disruption of the ability to conduct normal life functions, including those that lead to hospitalization, significant disability or birth defects, are life-threatening or result in death or require medical intervention to prevent one of the above outcomes (Little and Ebbert 2015; US Food and Drug Administration 2013). These do not include acute side effects such as nausea, vomiting and altered heart rate. Relevant reviews were also sought, partly to look for relevant publications, and partly for citation in this report. No attempt was made to identify individual clinical trials of smoking cessation in healthy individuals, since use of NRT is usually short term and serious adverse events are rare. Rather, we sought published meta-analyses of such trials, although also considering meta-analyses which included a small proportion of studies in diseased subjects.

Clinical and epidemiological studies in which pharmaceutical NRT products were used are included. In this review, pharmaceutical NRTs are products regulated as drugs or medicines to aid smoking cessation (Royal College of Physicians 2016; US Food and Drug Administration 2013). Studies that use any product that contains tobacco are excluded as they are regulated as tobacco products. Epidemiological studies on Swedish snus (a smokeless tobacco) are often used as an indirect measure of the potential adverse health effects of long-term nicotine use in humans (Benowitz 2011). However, for this review, exposure to NRT products and snus (or any other smokeless tobacco product) is not considered equivalent. NRT products are approved or licensed (regulated) as drugs or medicines to aid in smoking cessation, whereas snus and other smokeless tobacco products are regulated as tobacco products (Royal College of Physicians 2016; US Food and Drug Administration 2016). Likewise, studies that use ENDS products are excluded as they are currently undergoing regulatory consideration as tobacco products, not as drugs or medicines for aiding smoking cessation (Royal College of Physicians 2016; US Food and Drug Administration 2016).

Relevant publications were subdivided by type of study (epidemiological, clinical trials) and outcome [cancer, reproductive/developmental, cardiovascular disease (CVD), stroke, other SAHEs seen in patients, and other SAHEs seen in healthy populations].

Literature searches

The first step was a PubMed search on 24 July 2015, using the terms:

(“NRT”[All Fields] OR “nicotine replacement therapy”[All Fields] OR “Nicotine chewing gum”[All Fields] OR “Nicotine patch”[All Fields] OR “Tobacco use cessation”[All Fields] OR “Tobacco use Cessation/methods”[All Fields]) AND (“Kidney disease”[All Fields] OR “Diabetes”[All Fields] OR “GI tract”[All Fields] OR “Stomach”[All fields] OR “Pancreas”[All Fields] OR “Pancreatic”[All Fields] OR “Reproductive”[All Fields] OR “Reproduction”[All Fields] OR “Pregnancy Complications”[Mesh terms] OR “Pregnancy Outcome”[Mesh terms] OR “Birth weight”[Mesh terms] OR “Infant, low birth weight”[Mesh terms] OR “Infant, newborn, diseases”[Mesh terms] OR “child development”[Mesh terms] OR “Cardiovascular disease”[All Fields] OR “Stroke”[All Fields] OR “Cancer”[All Fields]) AND ((“1960/01/01”[PDAT]: “3000/12/31”[PDAT]) AND “humans”[MeSH Terms]).
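For readers who wish to rerun or adapt the search, the following minimal Python sketch assembles a boolean query of the same form from term lists. It is an illustration only: the names `EXPOSURE_TERMS`, `OUTCOME_ALL_FIELDS`, `OUTCOME_MESH`, `or_group` and `build_query` are hypothetical, the outcome terms are grouped by field tag rather than kept in their original order (which should not matter for terms joined by OR), and field tags are written with uniform capitalization on the assumption that PubMed treats them case-insensitively.

```python
# Exposure terms from the search string above, all searched in [All Fields].
EXPOSURE_TERMS = [
    "NRT", "nicotine replacement therapy", "Nicotine chewing gum",
    "Nicotine patch", "Tobacco use cessation", "Tobacco use Cessation/methods",
]

# Outcome terms searched in [All Fields].
OUTCOME_ALL_FIELDS = [
    "Kidney disease", "Diabetes", "GI tract", "Stomach", "Pancreas",
    "Pancreatic", "Reproductive", "Reproduction", "Cardiovascular disease",
    "Stroke", "Cancer",
]

# Outcome terms searched as MeSH terms.
OUTCOME_MESH = [
    "Pregnancy Complications", "Pregnancy Outcome", "Birth weight",
    "Infant, low birth weight", "Infant, newborn, diseases",
    "child development",
]


def or_group(terms, tag):
    """Join quoted terms carrying the same field tag with OR."""
    return " OR ".join(f'"{term}"[{tag}]' for term in terms)


def build_query():
    """Return (exposure) AND (outcome) AND (date and species limits)."""
    exposure = or_group(EXPOSURE_TERMS, "All Fields")
    outcome = (or_group(OUTCOME_ALL_FIELDS, "All Fields")
               + " OR " + or_group(OUTCOME_MESH, "Mesh terms"))
    limits = ('("1960/01/01"[PDAT] : "3000/12/31"[PDAT]) '
              'AND "humans"[MeSH Terms]')
    return f"({exposure}) AND ({outcome}) AND ({limits})"


query = build_query()
```

Keeping the terms in lists makes it easier to document later additions to the search; the resulting string can be pasted into the PubMed search box.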

Abstracts of papers and reviews identified were inspected, with reasons for rejection noted for those clearly irrelevant, and publications of potential interest obtained. Papers read were either accepted or rejected with reasons noted (Table 1).

Table 1 Literature searches and reasons for rejection

Relevant clinical trial reviews in the Cochrane library were also obtained. Reference lists of accepted papers and reviews were then examined and further papers obtained and read, with additional papers accepted and reasons for rejection noted for others (Table 1).

Near finalization of the paper, an additional PubMed search was conducted on 30 November 2015 using the same search terms described above.

Assessment of studies

For each topic, a critical assessment form (CAF) was completed for each publication. Notes for completing CAFs are given in Supplementary File 1, together with the completed CAFs. Each form provides a detailed summary of study design, findings and strengths and weaknesses.

There are two types of CAF. The first, for publications describing results of individual studies, is divided into 18 sections: form no.; topic; author(s); title; source; study type; study location; population studied and inclusion criteria; nicotine exposures; treatment groups and sizes; relevant endpoints; confounding variables; other relevant study details; relevant findings; authors’ main relevant conclusions; strengths and weaknesses mentioned; study quality score (for epidemiology studies) or risk of bias (for clinical trials); and comments.

The second, for publications describing results of meta-analyses of SAHEs in clinical trials of healthy people, is divided into 17 sections: form no.; topic; author(s); title; source; meta-analysis type; location of studies; populations studied and inclusion criteria; searches; nicotine exposures; numbers of subjects considered; relevant endpoints; relevant findings; authors’ main relevant conclusions; study quality assessment; strengths and weaknesses mentioned; and comments.

The CAFs are presented in Supplementary File 1 separately by topic (e.g. cancer and reproduction/development) and are numbered consecutively. Exceptionally, where one publication provides results for multiple topics, the CAFs use versions labelled, e.g. 28A, 28B and 28C.

Study quality was assessed as good, fair or poor based on the NIH published quality assessment tools for observational cohort and cross-sectional studies (National Heart Lung and Blood Institute 2014b) and case–control studies (National Heart Lung and Blood Institute 2014a). For clinical trials, risk of bias was assessed using the Cochrane Collaboration’s tool (Higgins et al. 2011). Supplementary File 2 summarizes methods used for both assessment types and gives detailed results for each study. Supplementary Files 3, 4 and 5 give the results from the assessments for, respectively, reproduction/development, CVD and other SAHEs in patients. Supplementary File 6 gives assessments for the meta-analyses of other SAHEs.

Summarizing results for an endpoint

The method varies with the extent of available data and includes a simple textual description, a table of results and/or a meta-analysis. Conducting a meta-analysis requires at least three studies reporting relevant results for the endpoint, and also requires that the results be reported similarly enough to allow combination across studies.
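As an illustration of how per-study estimates can be combined, the following minimal Python sketch pools effect estimates by inverse-variance weighting and then applies a DerSimonian-Laird random-effects step. The DerSimonian-Laird estimator is an assumption made here for illustration, since the text does not state which random-effects method was used. The function name `pool_estimates` and its inputs are hypothetical; ratio measures would be supplied on the log scale and mean differences on their natural scale.

```python
import math


def pool_estimates(estimates, std_errors):
    """Pool per-study estimates (log ratios or mean differences).

    Returns the fixed-effect estimate, the random-effects estimate with
    its standard error and 95 % CI, Cochran's Q and the between-study
    variance tau2 (DerSimonian-Laird, assumed for illustration).
    """
    k = len(estimates)
    if k < 3:
        # Mirrors the rule of requiring at least three studies.
        raise ValueError("at least three studies are required")

    # Fixed-effect (inverse-variance) weights and estimate.
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))

    # Method-of-moments between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to each within-study variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum_re
    se_pooled = math.sqrt(1.0 / sum_re)

    return {
        "fixed": fixed,
        "pooled": pooled,
        "se": se_pooled,
        "ci95": (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled),
        "q": q,
        "tau2": tau2,
    }
```

For a ratio measure, the pooled value and its confidence limits would be exponentiated before reporting. When the estimated between-study variance is zero, the random-effects and fixed-effect estimates coincide.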

Results presented are always given with the NRT exposed group as the test group and the non-exposed as the comparison group. Thus, estimates of the relative risk (RR), odds ratio (OR) or hazard ratio (HR) always relate to the NRT/non-NRT ratio, and mean differences are always NRT minus non-NRT. Generally, the analysis is restricted to those who have smoked, though exceptions where the source paper does not allow this are indicated as appropriate. Often, the NRT versus non-NRT comparison statistic has to be estimated from the source publication using standard methods. Examples where this is necessary include estimating the unadjusted RR and its 95 % confidence interval (CI) in a clinical trial where numbers affected and at risk are available for the NRT and placebo groups; estimating unadjusted mean differences and 95 % CIs from group-specific means and standard deviations; estimating CIs from means and p values; and estimating RRs, ORs or HRs and their 95 % CIs for a more relevant comparison than that reported (e.g. relative to never smokers).
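The first three kinds of estimation listed above can be sketched in Python as follows. This is a minimal illustration using textbook large-sample formulae, not a statement of the exact calculations performed for this review; the function names are hypothetical, and any numbers passed to them in practice would come from the source publications.

```python
import math
from statistics import NormalDist

# Two-sided 95 % normal quantile (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def relative_risk(events_nrt, n_nrt, events_ctl, n_ctl):
    """Unadjusted RR (NRT / non-NRT) with a 95 % CI on the log scale.

    Zero-event groups would need a continuity correction, which is
    not handled in this sketch.
    """
    rr = (events_nrt / n_nrt) / (events_ctl / n_ctl)
    se_log = math.sqrt(1 / events_nrt - 1 / n_nrt
                       + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - Z95 * se_log)
    upper = math.exp(math.log(rr) + Z95 * se_log)
    return rr, lower, upper


def mean_difference(mean_nrt, sd_nrt, n_nrt, mean_ctl, sd_ctl, n_ctl):
    """Unadjusted mean difference (NRT minus non-NRT) with a 95 % CI."""
    diff = mean_nrt - mean_ctl
    se = math.sqrt(sd_nrt ** 2 / n_nrt + sd_ctl ** 2 / n_ctl)
    return diff, diff - Z95 * se, diff + Z95 * se


def ci_from_p_value(estimate, p_two_sided):
    """Approximate 95 % CI from an estimate and its exact two-sided p value.

    Assumes a normal test statistic, a non-zero estimate and 0 < p < 1.
    For ratio measures, pass the log of the ratio and exponentiate
    the returned limits.
    """
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    se = abs(estimate) / z
    return estimate - Z95 * se, estimate + Z95 * se
```

The p value approach only works when an exact p value is reported; a statement such as "p < 0.05" gives an upper bound on the standard error rather than an estimate of it.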

Strength of evidence assessment

This was carried out for each endpoint, and overall for the topic. The criteria for evaluating specific SAHEs were adapted and modified from those outlined by the International Agency for Research on Cancer (IARC) (International Agency for Research on Cancer 2007). A major modification to the IARC strength of evidence assessment is the separation of their classification “evidence suggesting lack of carcinogenicity” into “limited evidence suggesting a lack of effect” and “sufficient evidence of a lack of effect”. With this revised classification in place, the strength of evidence conclusions can be neutral (inadequate evidence) or can be deemed limited or sufficient in both directions, for an adverse effect or a lack of an adverse effect (Haussmann and Fariss 2016). The criteria used were as follows.

Sufficient evidence: A causal relationship has been established in humans between exposure to NRT and this SAHE. A positive association was observed in which chance, bias and confounding factors could be ruled out with reasonable confidence. Conclusive studies have been conducted.

Limited evidence: A positive association was observed between exposure to NRT and this SAHE, but chance, bias and confounding factors could not all be ruled out with reasonable confidence. Conclusive studies are lacking.

Inadequate evidence: The available studies are of insufficient quality, consistency or statistical power to permit a conclusion regarding a positive association between exposure to NRT and this SAHE. This category includes no data or conflicting evidence in multiple studies.

Limited evidence suggesting a lack of effect: A statistically significant association was not observed between exposure to NRT and this SAHE, but a true relationship could not be ruled out with reasonable confidence because of a lack of statistical power, bias and/or confounding.

Sufficient evidence of a lack of effect: A lack of association has been observed between exposure to NRT and this SAHE, based on studies of adequate quality, consistency and statistical power. The possibility of a very small risk at relevant exposure levels can never be excluded.

Results

Literature searches

Table 1 summarizes results of the original searches. Forty-four relevant publications were found, six being reviews used only for searching for secondary references. Of the rest, one related to cancer, 19 to reproduction/development, 10 to CVD, three to stroke, five to other SAHEs in patients, and four were meta-analyses providing results for other SAHEs seen in healthy populations, some publications providing results for more than one endpoint. The additional PubMed search identified 33 more references, but none was relevant to this review.

Assessment of studies

Supplementary File 1 gives the completed CAFs, and Supplementary File 2 study-specific information on study quality and risk of bias. Of the 12 epidemiological studies described, two were classified as of “good” quality, eight as “fair” and two as “poor”. Of the 14 clinical trials, eight were considered to have a “low” risk of bias, five a “high” risk, with one classified as “unclear”.

While details of the studies are described by topic below, it should be noted that the follow-up period was generally very short. The longest were the 7½-year follow-up for cancer of participants in the Lung Health Study (Murray et al. 2009) and the follow-up until 2011 for investigating attention-deficit/hyperactivity disorder (ADHD) of children born between 1996 and 2003 in the Danish National Birth Cohort study (Zhu et al. 2014). A follow-up until 2006 for investigating strabismus in the same study (Torp-Pedersen et al. 2010) and two-year follow-ups in two clinical trials (Cooper et al. 2014b; Mohiuddin et al. 2007) were the only other instances that appeared to involve more than one year of follow-up.

Type of NRT

Of the 12 epidemiological studies, five (Carandang et al. 2011; Kimmel et al. 2001; Meine et al. 2005; Paciullo et al. 2009; Panos et al. 2010) considered only nicotine patches, while one (Murray et al. 2009) considered only nicotine gum. The remainder considered multiple types, or any type, of NRT, though results by type of NRT (gum, patch, inhaler) were reported only in two publications from the Danish National Birth Cohort (birthweight in Lassen et al. (2010) and stillbirth rate in Strandberg-Larsen et al. (2008)) and in the UK case series analysis of acute myocardial infarction (AMI), stroke and death (Hubbard et al. 2005).

Of the 14 clinical trials, three trials (Pollak et al. 2007; Swamy et al. 2009; Thomsen et al. 2010) allowed the subjects allocated NRT to choose the type they preferred, one (Mohiuddin et al. 2007) did not define the type and one (Oncken et al. 2008) allocated subjects to gum. The remainder allocated subjects to nicotine patches. Results by type of NRT product were never reported in any clinical trial.

Of the four meta-analyses considered, one (Greenland et al. 1998) only concerned patches, the others considering results from multiple studies, each involving allocation to one or more NRT types. The limited evidence on risk of SAHEs by NRT type reported in these meta-analyses and in the epidemiological studies is considered in the following sections.

Cancer

Based on findings from animal and mechanistic studies, a case for biological plausibility has been proposed for a potential role of nicotine in carcinogenesis (Cardinale et al. 2012; Grando 2014; Haussmann and Fariss 2016; Improgo et al. 2011). Carcinogenesis is a multistage process that involves three stages: initiation, promotion and progression (Klaunig 2013). For initiation, a substance must exhibit genotoxic properties which have been reported for nicotine at concentrations relevant to NRT use (Haussmann and Fariss 2016). Similarly, nicotine has been reported to stimulate cell proliferation, inhibit apoptosis, induce angiogenesis and inhibit immune function (Cardinale et al. 2012; Grando 2014). Thus, there is considerable evidence that nicotine exposure (at levels relevant to NRT users) can affect many of the cellular processes that are considered important for the initiation, promotion or progression of the carcinogenic process. In fact, a recent comprehensive review concludes that a majority of studies provides sufficient evidence for an association between short-term nicotine exposure and enhanced carcinogenesis of cancer cells inoculated in mice (Haussmann and Fariss 2016). The results from these non-clinical studies clearly support investigating similar nicotine effects with NRT use, especially in smokers or former smokers where initiated cancer cells may be present.

For NRT users, only one useful publication was found (Murray et al. 2009; see CAF 1). This described follow-up of participants in the Lung Health Study, an RCT of middle-aged volunteers with asymptomatic airways obstruction randomized to a smoking intervention involving encouragement to use nicotine gum. The 7½-year follow-up started after the 5-year intervention period and compared cancer risk in 3315 participants alive and cancer-free at start of follow-up by NRT use in the preceding 5 years. The study quality was rated “good”.

After adjustment for baseline age, sex, cigarettes per day and lifetime pack-years smoking, NRT use was unrelated to lung cancer, gastrointestinal (GI) cancer or all cancer (see Table 2), regardless of adjustment for pack-years cigarette use in the 5 years following randomization. Nor were relationships seen with NRT use when mean daily use was replaced by any NRT use (results not shown). In contrast, pack-years cigarette use was significantly related to lung cancer risk, whether or not adjusted for daily NRT use. The authors noted that the results add “credence to our conclusion that nicotine replacement therapy does not cause cancer”.

Table 2 Results summary for NRT and cancer

Much more limited evidence comes from a 12-month follow-up study of HIV-infected individuals (Elzi et al. (2006), CAF 24). This study reported one lung cancer death among 34 smokers participating in a smoking cessation program where NRT was made available, and one among 383 non-participating smokers. This comparison is not statistically significant at p < 0.05 and was not adjusted for the participants having higher baseline cigarette consumption and disease rates than non-participants.
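The lack of significance in this comparison can be checked with an exact test on the 2 × 2 table of deaths by participation. The sketch below uses a two-sided Fisher exact test written with the Python standard library only; the choice of Fisher's test is an assumption made for illustration, as the test applied to these counts is not stated, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the table [[a, b], [c, d]].

    a, b: events and non-events in group 1 (here, participants)
    c, d: events and non-events in group 2 (here, non-participants)
    """
    n1, n2 = a + b, c + d
    events = a + c
    total = n1 + n2

    def prob(k):
        # Hypergeometric probability of k events falling in group 1.
        return comb(n1, k) * comb(n2, events - k) / comb(total, events)

    p_observed = prob(a)
    k_min = max(0, events - n2)
    k_max = min(n1, events)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))


# One death among 34 participants versus one among 383 non-participants.
p_value = fisher_exact_two_sided(1, 33, 1, 382)
```

With only two deaths in total, the resulting p value is above 0.05, and any estimate of relative risk from these counts would be extremely imprecise.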

Given the few publications, the limited follow-up for a chronic disease like cancer, and the difficulty of reliably disentangling effects of NRT and smoking, we consider there is inadequate evidence to permit a conclusion regarding an association between exposure to NRT and cancer.

Reproduction/development

Based on findings from published animal studies, sufficient evidence supports an association between short-term nicotine exposure and adverse reproductive and developmental effects (Bruin et al. 2010; Hall et al. 2016; Rehan et al. 2012; Slotkin 2008; Spindel and McEvoy 2016; Wong et al. 2015). This is not surprising since nicotine binds to and activates nACh receptors, mimicking the effects of the endogenous ligand for this receptor, acetylcholine. It is well known that during the development of the central nervous system, neurotransmitters such as acetylcholine play a critical role in brain assembly from early embryonic stages to early adulthood (Slotkin 2008). These nicotinic acetylcholine receptors also play an important role in coordinating the development of other organ systems including reproductive organs and the lung (Bruin et al. 2010; Rehan et al. 2012; Spindel and McEvoy 2016). As a result, animal studies have consistently reported that prenatal exposure to nicotine can result in deficits in reproductive function (Bruin et al. 2010; Wong et al. 2015), behavioural and cognitive dysfunction such as hyperactivity, cognitive impairment and increased anxiety (Hall et al. 2016; Pauly and Slotkin 2008), as well as pulmonary effects such as impaired lung development and function (Rehan et al. 2012; Spindel and McEvoy 2016). For humans, there are conflicting views about the safe use of pharmaceutical NRT in pregnant women. Warnings from the US Food and Drug Administration place NRT therapies in Pregnancy Categories “C” or “D” as developmental hazards (Slotkin 2008). In contrast, guidance from the UK Committee on Safety of Medicines encourages the use of pharmaceutical NRT products for smoking cessation during pregnancy (Pauly and Slotkin 2008). Thus, the apparent uncertainty in humans and the adverse effects clearly observed in animal studies support the need for a comprehensive review of studies investigating reproductive/developmental effects in offspring of NRT users.

As summarized in Supplementary File 3, eight publications based on epidemiological studies provided information on NRT and reproduction/development. Fuller study details are presented in CAFs 2 to 9. A cross-sectional study of women interviewed post-natally (Gaither et al. (2009), CAF 8, study quality “fair”) related NRT use in pregnancy to low birthweight and preterm birth of their offspring. Another study (Dhalwani et al. (2015), CAF 9, “fair”) used mother–child primary care records of children born in the UK to relate NRT prescription to incidence of major congenital abnormalities. The remainder (Lassen et al. (2010); Milidou et al. (2012); Morales-Suarez-Varela et al. (2006); Strandberg-Larsen et al. (2008); Torp-Pedersen et al. (2010); Zhu et al. (2014), CAFs 2 to 7, all “good”) derived from the Danish National Birth Cohort, each concerning different endpoints. The analyses all involve births in 1996 to 2003, though the publications vary in inclusion/exclusion criteria, and the comparison groups used for assessing NRT effects, some only reporting risks for women who used NRT and did not smoke. Common weaknesses of some of the analyses include the few relevant events in NRT users and the failure to control for changes in smoking behaviour after starting NRT.

Supplementary File 3 also summarizes 11 publications relating to eight clinical trials of smoking cessation (CAFs 10 to 20). Two (CAFs 13, 15) concern a multicentre trial in North Carolina (risk of bias “unclear”), one (Pollak et al. 2007) reporting results for various endpoints, the other (Swamy et al. 2009) presenting a detailed analysis of adverse events following medical record review. Three (CAFs 16, 19 and 20) concern a multicentre trial in England (risk of bias “low”), Coleman et al. (2012b) giving results for various endpoints recorded at or before birth, Cooper et al. (2014b) reporting on infant and maternal outcomes at 2 years, and Cooper et al. (2014a) presenting an extremely detailed report, but providing no additional relevant results. This is by far the largest study. Of the remaining six studies, two provide little useful information, one (Kapur et al. (2001), CAF 11, risk of bias “low”) having terminated prematurely and the other (Schroeder et al. (2002), CAF 12, risk of bias “high”) being small, with no controls. Of the other four studies, one (Oncken et al. (2008), CAF 14, risk of bias “low”) compared nicotine and placebo gum, two (Berlin et al. (2014) and Wisborg et al. (2000), CAFs 10 and 18, both risk of bias “low”) compared nicotine and placebo patches, and one (El-Mohandes et al. (2013), CAF 17, risk of bias “low”) compared cognitive behavioural therapy alone and in conjunction with nicotine patches. A similar comparison was made in the North Carolina study (Pollak et al. 2007; Swamy et al. 2009). The studies provide results for various endpoints, considered in turn below.

Fetal loss and spontaneous abortion (Table 3): Each effect estimate is based on few exposed cases, none being significant (at p < 0.05). For stillbirth/fetal loss, the meta-analysis estimate is 0.78 (95 % CI 0.45–1.33), with no heterogeneity. Stillbirth risk was also noted to be unaffected by type of NRT in one study (Strandberg-Larsen et al. 2008).

Table 3 Results summary for NRT and reproduction/development—fetal loss and spontaneous abortion

Birthweight (Table 4): Eight studies provide relevant data, one clinical trial giving results in two publications (Pollak et al. 2007; Swamy et al. 2009). Seven studies provided heterogeneous (p = 0.01) estimates for risk of low birthweight, Oncken et al. (2008) reporting a highly significant (p < 0.01) decreased risk with nicotine gum and the others no significant differences. A random-effect meta-analysis showed no overall effect, RR 0.81 (95 % CI 0.48–1.39). Five trials compared mean birthweight in NRT and placebo groups. Substantially higher birthweights associated with NRT exposure, by 200 g or more, were seen in three studies, significantly so (at p < 0.05) in two (Oncken et al. 2008; Wisborg et al. 2000), but not in the other (El-Mohandes et al. 2013). Smaller, non-significant, differences were seen in the other trials (Berlin et al. 2014; Pollak et al. 2007) and also in an epidemiological study (Lassen et al. 2010), which expressed the birthweight change per week of NRT use, and also found no significant variation by type of NRT. The inverse-variance weighted mean difference associated with NRT use was estimated as 142 g (95 % CI −53 to 336).

Table 4 Results summary for NRT and reproduction/development—birthweight or low birthweight

Gestational age (Table 5): Of five clinical trials reporting results for gestational age difference, two (El-Mohandes et al. 2013; Oncken et al. 2008) reported a significant increase associated with NRT allocation, the others no effect. The combined weighted estimate was 0.1 (95 % CI −0.5 to 0.6) weeks. Pollak et al. (2007) also reported results for small for gestational age, but based on very few cases.

Table 5 Results summary for NRT and reproduction/development—gestational age/small for gestational age

Head circumference and infant length at birth (Table 6): No significant differences were seen in the two trials reporting results.

Table 6 Results summary for NRT and reproduction/development—head circumference/length of infant

Preterm birth (Table 7): In one epidemiological study (Gaither et al. 2009), results were reported relative to non-smokers for smokers who were or were not prescribed NRT in pregnancy. For consistency with the other studies, these have been converted to an OR comparing NRT with no NRT within smokers. While this OR, 1.88 (0.97–3.56), is close to showing a significant increase in risk, no result from the six clinical trials does so, one (Oncken et al. 2008) giving a significantly reduced OR (0.39, 0.17–0.91). A meta-analysis of the seven estimates showed no association, the random-effect estimate being 0.98 (0.70–1.37).
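One way to re-express two ORs that share a non-smoker reference group as a single OR within smokers is to take their ratio on the log scale. The Python sketch below illustrates this under stated assumptions: it is not necessarily the calculation used for Table 7, the function name is hypothetical, and because the covariance induced by the shared reference group is ignored, the resulting CI is wider than one derived from the underlying counts would be.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)


def or_within_smokers(or_nrt, ci_nrt, or_no_nrt, ci_no_nrt):
    """Approximate OR for NRT versus no NRT among smokers.

    or_nrt, ci_nrt:       OR and (lower, upper) 95 % CI for smokers
                          prescribed NRT versus non-smokers
    or_no_nrt, ci_no_nrt: the same for smokers not prescribed NRT
    """
    # Recover each standard error of the log OR from its 95 % CI.
    se_nrt = (math.log(ci_nrt[1]) - math.log(ci_nrt[0])) / (2 * Z95)
    se_no = (math.log(ci_no_nrt[1]) - math.log(ci_no_nrt[0])) / (2 * Z95)

    # The shared reference group cancels in the ratio of the two ORs.
    log_ratio = math.log(or_nrt) - math.log(or_no_nrt)

    # Assumption: the covariance between the two log ORs is ignored,
    # which overstates the variance and so gives a conservative CI.
    se = math.sqrt(se_nrt ** 2 + se_no ** 2)

    return (math.exp(log_ratio),
            math.exp(log_ratio - Z95 * se),
            math.exp(log_ratio + Z95 * se))
```

Where the source publication gives the numbers of cases and non-cases in each group, computing the within-smoker OR directly from those counts is preferable to this approximation.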

Table 7 Results summary for NRT and reproduction/development—preterm birth

Neonatal intensive care admissions (Table 8): In the four clinical trials reporting findings, no significant differences were seen, the overall random-effect estimate being 0.95 (95 % CI 0.61–1.47).

Table 8 Results summary for NRT and reproduction/development—neonatal intensive care admissions

Neonatal death (Table 9): The extremely limited data from two trials provides no indication of any major effect of NRT.

Table 9 Results summary for NRT and reproduction/development—neonatal death

Congenital abnormalities (Table 10): The two clinical trials (Berlin et al. 2014; Coleman et al. 2012b) reported results only for any congenital abnormality, each finding a non-significantly reduced risk based on quite few cases. An analysis based on the Danish National Birth Cohort (Morales-Suarez-Varela et al. 2006) was unusual in only considering NRT users who did not smoke, not mentioning those who smoked and used NRT in pregnancy. Compared with those who neither smoked nor used NRT, the authors reported a marginally significant increased overall risk, with the adjusted relative rate given as 1.61 (95 % CI 1.01–2.58), which they suggested “needs to be replicated”. The RRs listed in Table 10 are based on comparison with smokers using NRT, this being non-significant 1.50 (0.94–2.41). A much larger UK study (Dhalwani et al. 2015) only reported results for major congenital abnormalities, but found no association with overall risk. Combining the estimate for this study with those for any congenital abnormality in the other studies gave a random-effect estimate of 1.10 (0.86–1.41), which reduced to 1.02 (0.81–1.28) if the result for non-smokers from the Danish cohort was excluded. Results by type of abnormality were only available from the two epidemiological studies. No significant increase was seen for major musculoskeletal abnormalities in the Danish cohort, but the UK study reported a significant increase in respiratory system abnormalities (HR 3.49, 95 % CI 1.40–8.71) though no significant effect of NRT for 10 other groupings. Commenting on the respiratory system finding, the authors noted “the statistical power was limited” and “higher morbidities in those women prescribed NRT may also be an explanatory factor”.

Table 10 Results summary for NRT and reproduction/development—congenital abnormalities

Apgar score (Table 11): Three clinical trials presented results for this measure of the health of a newborn child. Different metrics were used, but no significant NRT effect was seen.

Table 11 Results summary for NRT and reproduction/development—Apgar score

Other endpoints for NRT and reproduction/development (Table 12): No significant relationships or consistent patterns with NRT use were evident from the 10 publications providing results.

Table 12 Results summary for NRT and reproduction/development—other endpoints

Authors’ conclusions varied as to whether NRT was harmful, beneficial or had no effect on reproduction and development, though views expressed were often far from confident. Possible beneficial effects were concluded for three trials (El-Mohandes et al. 2013; Oncken et al. 2008; Wisborg et al. 2000) based on increases in birthweight and/or gestational age in the NRT group. The large SNAP trial (Cooper et al. 2014b) also noted a beneficial effect on development at 2 years. Possible adverse effects were also noted, with the endpoints varying: overall congenital abnormalities (Morales-Suarez-Varela et al. 2006); infantile colic (Milidou et al. 2012); attention-deficit/hyperactivity disorder (Zhu et al. 2014); low birthweight and preterm birth (Gaither et al. 2009); respiratory system abnormalities (Dhalwani et al. 2015); rapid fetal movements (Kapur et al. 2001); and negative birth outcomes (Pollak et al. 2007). However, four of these conclusions (Gaither et al. 2009; Milidou et al. 2012; Morales-Suarez-Varela et al. 2006; Zhu et al. 2014) were based on significant differences from non-smokers, which were no longer significant when comparisons were made within smokers. Furthermore, one conclusion (Kapur et al. 2001) was based on a single case of rapid fetal movements in a subject given a placebo patch, while differences noted in the final study (Pollak et al. 2007) were not significant. The only study reporting significantly worse outcomes in smokers using (or allocated to) NRT than in other smokers was the UK study that found an increased incidence of major respiratory system congenital abnormalities (Dhalwani et al. 2015), and even here the authors noted “women prescribed NRT were considerably more likely to have diagnosed morbidities, particularly asthma and mental illnesses”. The remaining studies reported finding no effect, some for specific effects and some for a range of effects.

Except possibly for congenital abnormalities of the respiratory system (Dhalwani et al. 2015), the results in Tables 3–12 show no evidence of a significantly poorer outcome in NRT users, whether in individual studies or in meta-analyses. Indeed, there is some evidence that allocation to NRT is associated with increased birthweight and gestational age at birth, and with better development of the offspring at 2 years. However, despite the quite large number of studies, the evidence has various limitations, including the few NRT-exposed cases for some endpoints, the lack of data on response by dose or duration of use, the minimal data by type of NRT used, and the failure to adjust for extent and duration of cigarette smoking before NRT prescription, or extent of change in smoking following it.

For some more commonly studied conditions (fetal loss/spontaneous abortion, birthweight/low birthweight, gestational age/small for gestational age, preterm birth, neonatal intensive care admissions and overall incidence of congenital abnormalities), where a lack of significant association was seen in meta-analyses, the data provide limited evidence suggesting a lack of effect of NRT. Limited evidence of an effect is more appropriate for congenital abnormalities of the respiratory system, where the observed positive association may result from confounding by increased morbidity in pregnant smokers allocated to NRT. For other endpoints considered (head circumference/length of infant, neonatal death and Apgar score), an assessment of inadequate evidence seems more appropriate for NRT.

Cardiovascular disease

The cardiovascular effects commonly observed with acute NRT use are an increase in heart rate, blood pressure and cardiac output (Benowitz and Gourlay 1997). These effects are physiological responses to nicotine’s activation of nAChRs, resulting in part from sympathetic neural stimulation and catecholamine release from the adrenal glands (Benowitz and Gourlay 1997; Klevans and Gebber 1970). Because of the rapid development of tolerance to these physiological alterations, it seems unlikely that these acute effects play a significant role in cardiovascular disease (CVD). In support of this statement, long-term (18–24 months) nicotine exposure (at levels observed in NRT users) does not appear to cause cardiovascular disease in experimental animals (Theophilus et al. 2012; Waldum et al. 1996; Wilson et al. 1938). In contrast, the short-term (≤12 weeks) administration of nicotine to animals has been shown to enhance or aggravate existing cardiovascular disease (i.e. chronic hypertension, atherosclerosis, myocardial ischaemia/reperfusion injury) in experimental models (Bui et al. 1995; Lau et al. 2006; Sridharan et al. 1985; Zhou et al. 2013). The results of these non-clinical studies suggest the potential for serious cardiovascular adverse effects with nicotine exposure and support the need for a comprehensive review of studies investigating the presence of such SAHEs in NRT users.

As summarized in Supplementary File 4, six epidemiological studies (CAFs 21 to 26) provided relevant information. One was a case–control study of MI (Kimmel et al. 2001), one a case series analysis (Hubbard et al. 2005), one a prospective study of HIV-infected individuals (Elzi et al. 2006) and three prospective studies of patients undergoing cardiac procedures (Meine et al. 2005; Paciullo et al. 2009; Woolf et al. 2012). Two studies (CAFs 24 and 25) were considered of “poor” study quality, the other four “fair”. Common weaknesses of these studies include few relevant events, short follow-up period and lack of control for confounders, including changes in smoking habits after starting NRT. Four clinical trials of patients with cardiac disease also provided relevant data (CAFs 27, 28A, 29 and 30A). Three trials (Joseph et al. 1996; Rennard et al. 1994; Tzivoni et al. 1998) were double-blind RCTs of nicotine patches, ascribed a “low” risk of bias. The other (Mohiuddin et al. 2007), ascribed a “high” risk of bias, was less informative, only providing results relevant to the effects of an intensive intervention with individualized pharmacotherapy, which only included NRT for some patients, results not being separately presented for those prescribed NRT. Note that one further clinical trial (Thomsen et al. 2010) reported two cases of cardiovascular complications in those allocated to an intervention group including free supply of NRT and one in the control group, as part of a range of complications considered. Including those results, considered in the section on “Other SAHEs in patients”, would not have affected the conclusions for CVD. The studies provide results for various endpoints, as considered below.

AMI (Table 13): Four studies (Joseph et al. 1996; Kimmel et al. 2001; Mohiuddin et al. 2007; Woolf et al. 2012), each based on fewer than 10 exposed cases, gave RRs (or ORs) less than 1.00, but not significantly so. Another study (Elzi et al. 2006), also based on few cases, gave a significantly increased RR of 5.63 (95 % CI 1.07–29.64). This was based on an uncontrolled comparison of HIV patients participating (and not participating) in a smoking cessation program involving supply of NRT, and the data presented showed a substantially higher history of CHD at baseline in the NRT exposed group. An analysis of a UK national database (Hubbard et al. 2005) based on 33,247 individuals prescribed NRT compared AMI incidence in the 56 days pre- and post-prescription with that in the period more than 56 days from the prescription. The high RR of 5.55 (95 % CI 4.42–6.98) in the 56 days before NRT prescription suggests reverse causation, with NRT “being prescribed shortly after myocardial infarctions and strokes”, and is irrelevant to whether NRT might cause AMI. The lower RR of 1.27 (0.82–1.97) for the 56 days after prescription (which did not vary significantly by type of NRT) is more relevant. Excluding the high RR, the remaining estimates give a combined random-effect meta-analysis estimate of 0.97 (0.55–1.71). Also omitting two studies not of NRT per se (Elzi et al. 2006; Mohiuddin et al. 2007), the estimate rises to 1.08 (0.75–1.56), but remains non-significant.
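The significance statements above can be checked from the reported intervals alone. The following is a minimal sketch, assuming each 95 % CI was constructed symmetrically on the log scale (an assumption; the original studies may have used other methods). The function names are illustrative and do not come from any of the cited publications.

```python
import math

def log_se_from_ci(lower, upper, z=1.96):
    """Standard error of log(RR), assuming a log-symmetric 95 % CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def z_score(rr, lower, upper):
    """Wald z statistic for the null hypothesis RR = 1 (same assumption)."""
    return math.log(rr) / log_se_from_ci(lower, upper)

# Estimates quoted in the AMI paragraph
elzi = z_score(5.63, 1.07, 29.64)          # HIV cohort (Elzi et al. 2006)
hubbard_post = z_score(1.27, 0.82, 1.97)   # 56 days after prescription

print(f"Elzi z = {elzi:.2f}, Hubbard post-prescription z = {hubbard_post:.2f}")
```

Under this assumption the first z statistic only just exceeds 1.96, in line with a lower confidence limit of 1.07, while the second falls well short of it.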

Table 13 Results summary for NRT and CVD–AMI

Mortality (Table 14): For the studies providing relevant results, the period following exposure varied, where stated, from 56 days to 2 years. Since the deaths were shortly after NRT exposure, and in all studies except one (Hubbard et al. 2005) in patients with cardiac disease, most deaths were probably from CVD, though only Mohiuddin et al. (2007) give mortality results specifically for CVD. Although deaths in exposed cases were generally few, only exceeding 10 in the case series analysis (Hubbard et al. 2005), the RRs were, with one exception, below 1.00, though only the RR of 0.23 (95 % CI 0.07–0.73) for the study of intensive intervention versus usual care (Mohiuddin et al. 2007) was significant at p < 0.05. The exception was the study of patients undergoing coronary artery bypass graft surgery (Paciullo et al. 2009), where the adjusted RR, estimated as 6.06 (1.65–22.21), was based on only three exposed cases (of 40 at risk) and one unexposed case (of 489). Note that the unadjusted RR of 2.47 (0.74–8.32) was not significant, the reliability of adjusted analyses based on so few deaths being open to question. Random-effect meta-analysis showed no significant overall effect, the RR being 0.81 (95 % CI 0.41–1.60) based on all the studies and 1.01 (0.51–1.98) omitting the trial of intensive intervention (Mohiuddin et al. 2007). Both analyses showed significant (p < 0.05) heterogeneity due to the single high estimate.
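To illustrate how a single high estimate can dominate heterogeneity in a random-effect meta-analysis, the sketch below implements DerSimonian–Laird pooling. This is an assumed method, as the text does not state which random-effect estimator was used. The input values are HYPOTHETICAL and are not the Table 14 data; they are chosen only to mimic several estimates below 1.00 plus one high outlier.

```python
import math

def dersimonian_laird(estimates, z=1.96):
    """Pool (rr, lower, upper) tuples on the log scale.

    Returns (pooled RR, lower limit, upper limit, Cochran's Q).
    Assumes every input CI is log-symmetric.
    """
    y = [math.log(rr) for rr, _, _ in estimates]
    v = [((math.log(hi) - math.log(lo)) / (2 * z)) ** 2 for _, lo, hi in estimates]
    w = [1 / vi for vi in v]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)       # between-study variance
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return (math.exp(pooled), math.exp(pooled - z * se),
            math.exp(pooled + z * se), q)

# HYPOTHETICAL inputs (rr, lower, upper); the last mimics a single high estimate
studies = [(0.80, 0.40, 1.60), (0.70, 0.30, 1.63),
           (0.90, 0.50, 1.62), (6.00, 2.00, 18.00)]

with_outlier = dersimonian_laird(studies)
without_outlier = dersimonian_laird(studies[:-1])
print(with_outlier, without_outlier)
```

With these illustrative inputs, Cochran's Q exceeds the 5 % critical value for 3 degrees of freedom (7.81) only when the outlying estimate is included.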

Table 14 Results summary for NRT and CVD—Mortality

Admissions/readmissions (Table 15): Six studies presented results for admissions/readmissions overall or for various heart-related reasons. Numbers of exposed cases tend to be greater than in Tables 13 and 14. Of 15 estimates, five are above 1.00 and 10 below. The only two significant at p < 0.05 were decreases seen in the trial of intensive intervention (Mohiuddin et al. 2007). Due to the variety of endpoints, and the few estimates for any, meta-analysis has not been conducted.

Table 15 Results summary for NRT and CVD—Admissions/readmissions

Other endpoints (Table 16): Two clinical trials presented relevant results, though numbers of exposed cases are often quite low. Of the six effect estimates, one is above and five below 1.00, with none close to statistically significant (at p < 0.05). In contrast, the study of HIV-infected individuals (Elzi et al. 2006) gave an RR of 7.51 (95 % CI 1.30–43.41) for undergoing coronary angioplasty. However, as for AMI, the analysis took no account of the much higher baseline history of CVD in the group given NRT.

Table 16 Results summary for NRT and other CVD endpoints

Conclusions reached by the authors on NRT and CVD were generally not specific to an endpoint. Two (Kimmel et al. 2001; Mohiuddin et al. 2007) were quite positive that NRT (or intensive intervention) had a beneficial effect, while most authors simply argued no effect was demonstrated. The exception was Paciullo et al. (2009), who noted the significant increase in mortality, but commented that “additional evaluation in large patient cohorts with prospective controls is warranted”.

Except possibly for findings from two “poor” quality studies (Elzi et al. 2006; Paciullo et al. 2009), based on few exposed cases, the overall evidence suggests NRT is not associated with any increased CVD risk. However, it cannot be regarded as conclusive of a lack of effect. Limitations of the evidence include small numbers of exposed cases in most studies, some analyses not relating to actual use of NRT, the short follow-up period, the absence of dose–response data, and the failure to adjust for smoking either before or after NRT prescription. We conclude there is limited evidence suggesting a lack of effect concerning the relationship of NRT use to the risk of CVD.

Stroke

NRT product warning label statements indicate a potential health risk to NRT users with high blood pressure (US Food and Drug Administration 2015a). Such warnings suggest a potential risk of stroke with NRT or nicotine exposure. In fact, several animal studies support a potential role for nicotine in increasing cerebral microvessel thrombosis (Fahim et al. 2014) and inhibiting restoration of brain microvascular flow (Wang et al. 1997) in experimental models of thromboembolic injury and repair (as compared to vehicle controls without nicotine). Three published studies from separate laboratories (Bradford et al. 2011; Paulson et al. 2010; Wang et al. 1997) also investigated the impact of nicotine exposure on cerebral ischaemia/reperfusion injury in experimental animal models. Similar to the findings observed for myocardial ischaemia/reperfusion injury, short-term nicotine exposure increased cerebral infarct size. Based on findings from non-clinical studies described above, we evaluated epidemiological studies and clinical trials for an association between stroke and NRT use.

Only three studies (Hubbard et al. 2005; Joseph et al. 1996; Panos et al. 2010) provide information on NRT and stroke, fuller details being presented in CAFs 22B, 28B and 31. The first (Hubbard et al. (2005), study quality “fair”) was the case series analysis which also reported results on CVD and suffers from weaknesses noted above. The second (Joseph et al. (1996), risk of bias “low”) was the clinical study of patients with cardiac disease also reporting results for CVD and other SAHEs. The third (Panos et al. (2010), study quality “fair”) was a prospective study of patients admitted to an intensive care unit with a neurological insult. Note that one study (Carandang et al. 2011) which reported on the relationship between patch use and various adverse health effects in subarachnoid haemorrhage patients is considered in the section below on “Other SAHEs in patients”.

Table 17 summarizes the main results. The high RR (compared to baseline levels) of 3.59 (95 % CI 2.56–5.03) in the 56 days before NRT prescription for the case series analysis (Hubbard et al. 2005) suggests reverse causation and is irrelevant to whether NRT can cause strokes. The lower RR of 1.30 (0.77–2.19) for the 56 days after prescription is more relevant. There was no evidence of an effect of NRT in the 56 days post-prescription in subgroups by sex, age, previous history of angina or hypertension, or by type of NRT used, nor when analyses were repeated using second stroke as the outcome. An RR of 2.47 (1.16–5.24) in days 43–56 after prescription was regarded by the authors as an “isolated increase”, no evidence of an increased risk being seen for days 1–14, 15–28 or 29–42. No evidence of an effect of NRT was seen in the small clinical trial (Joseph et al. 1996) or in the study of neurological patients (Panos et al. 2010). None of the authors of the three publications claimed any adverse effect on stroke had been established.

Table 17 Results summary for NRT and stroke

While the results provide no clear evidence of any increased risk of stroke, one cannot regard the findings as conclusive of a lack of effect. This is because of the few publications and cases of stroke post-NRT and various study weaknesses. We consider there is inadequate evidence concerning the relationship of NRT to stroke.

Other SAHEs in patients

NRT product warning label statements (health-related) indicate potential risk to users with gastrointestinal (GI) conditions (such as stomach ulcers) and diabetes. For diabetes, several human studies demonstrate nicotine exposure results in enhanced insulin resistance in type 2 diabetics but not in healthy individuals (Axelsson et al. 2001; Epifano et al. 1992). For example, a study from Axelsson and co-workers investigated the effects of acute intravenous nicotine infusion on insulin sensitivity in non-smoker type 2 diabetics versus healthy non-smoker controls. While subjects with type 2 diabetes were more insulin resistant than healthy controls at baseline, nicotine infusion significantly decreased insulin sensitivity in type 2 diabetics but did not modify the insulin response in controls (Axelsson et al. 2001). In addition to diabetes, studies using well-known experimental models for peptic ulcer formation, pancreatic injury and renal injury have also demonstrated that short-term nicotine exposure has the potential to aggravate all of these pathological conditions (Arany et al. 2011; Chowdhury et al. 1995; Qiu et al. 1991; Wong et al. 2002). Therefore, based on findings from clinical and non-clinical studies described above, we evaluated epidemiological studies and clinical trials for an association between NRT use and other SAHEs such as diabetes, peptic ulcer, pancreatitis and renal effects.

As summarized in Supplementary File 5, one epidemiological study (Carandang et al. 2011) and four clinical trials (Joseph et al. 1996; Lee et al. 2013; Mohiuddin et al. 2007; Thomsen et al. 2010) of patients provide information relating NRT and other SAHEs not generally related to cancer or CVD. Further details are presented in CAFs 28C, 30B, 32, 33 and 34. The epidemiological study (Carandang et al. 2011), in patients with subarachnoid haemorrhage, was ascribed study quality “fair”. Three clinical trials (Lee et al. 2013; Mohiuddin et al. 2007; Thomsen et al. 2010) were ascribed a “high” risk of bias for only providing results for an intervention which included use of NRT (not necessarily in all patients), the results possibly reflecting effects of the intervention rather than of NRT. The other trial (Joseph et al. 1996), which did allocate patients to NRT, was ascribed risk of bias “low”. Common weaknesses include the small number of relevant events, the relatively short follow-up period and the failure to control for changes in smoking habits after starting NRT. A variety of endpoints were considered, including complications following surgery and readmissions to hospital.

The main results are summarized in Table 18, some further results being given in the CAFs. The only significant (p < 0.05) differences noted in relation to NRT use (or intervention) were for clinical vasospasm and poor outcome in the study of subarachnoid haemorrhage patients (Carandang et al. 2011), which was less frequent (OR 0.46, 95 % CI 0.25–0.84) in patients using patches. Total length of stay in hospital was also significantly shorter in patch users in this study (see CAF 32). There was no significant adverse effect of NRT use (or intervention) for any other endpoint. The authors generally agreed in regarding the results as not demonstrating an adverse effect of NRT (or intervention).

Table 18 Results summary for studies in patients on NRT and other serious health effects

The available evidence does not indicate any effect of NRT. However, this cannot be regarded as conclusive due to the weaknesses noted above. We conclude there is inadequate evidence concerning the relationship of NRT to the risk of the SAHEs considered here.

Other SAHEs, mainly in healthy populations

As summarized in Supplementary File 6, four meta-analyses (Greenland et al. 1998; Mills et al. 2010; Moore et al. 2009; Stead et al. 2012) provided relevant results, more details being given in CAFs 35–38. Most studies considered in the meta-analyses were of trials conducted in healthy populations, but some were of patients with specified conditions. Three meta-analyses (Greenland et al. 1998; Moore et al. 2009; Stead et al. 2012) limited attention to RCTs, the other (Mills et al. 2010) also including results of observational studies. One meta-analysis (Moore et al. 2009) was limited to RCTs of smokers declaring no intention to quit, concerned smoking reduction and involved only seven RCTs. The other meta-analyses concerned smoking cessation studies and involved far more studies, the more recent meta-analyses (Mills et al. 2010; Stead et al. 2012) combining data from over 100 studies. For two meta-analyses (Greenland et al. 1998; Mills et al. 2010), adverse events were clearly the central interest. However, one (Moore et al. 2009) was mainly concerned with efficacy and safety, and another (Stead et al. 2012) with smoking cessation, both giving very limited results for adverse events.

The main findings of the four meta-analyses are summarized in Table 19. Where prevalences were only given for the NRT groups, these are noted in CAF 37, but not included in Table 19, nor considered further here. As listed in Table 19, no significant relationships between NRT and any health effect were reported in the earliest meta-analysis (Greenland et al. 1998), based on quite low total incidences, or in the meta-analysis of smoking reduction studies (Moore et al. 2009). However, the recent meta-analyses (Mills et al. 2010; Stead et al. 2012), no doubt based on sets of studies which overlap considerably, both reported a significant (p < 0.01) approximate doubling of risk of heart palpitations and chest pains in subjects allocated to NRT. Mills et al. (2010) noted that the increased risk was evident both for patches and orally administered NRT and referred to high nicotine concentrations in the serum of NRT users who continue to smoke, and to pre-existing CVD, as possible explanations for the association.

Table 19 Results of meta-analyses of NRT and other serious adverse health effects

The first two reviews saw no evidence of harm from NRT. The third review (Mills et al. 2010) noted that “The use of NRT is associated with a variety of side effects”, but the side effects referred to, for example nausea and insomnia, are generally not considered serious. Even heart palpitations and chest pains are not generally considered SAHEs (Little and Ebbert 2015; Stead et al. 2012; US Food and Drug Administration 2015a). In agreement, the final review (Stead et al. 2012) noted that “there is no evidence that NRT increases the risk of heart attacks”. Commenting on the results for heart palpitations and chest pains, they stated “this is potentially the only clinically significant serious adverse event to emerge from the trials and constitutes an extremely rare event, occurring at a rate of 2.5 % in the NRT groups compared with 1.4 % in the control groups in the 15 trials in which it was reported at all”.

The third review (Mills et al. 2010) points out various relevant limitations, which apply generally: “These include limitations of the primary studies themselves as well as those associated with combining results across potentially heterogeneous studies or populations. The main limitation of the primary studies is the mechanism by which adverse events are recorded. In the majority of instances this would be through passive reporting and therefore be susceptible to the underreporting associated with such techniques”. Nevertheless, the magnitude and significance of the association of NRT use in RCTs with the reported incidence of heart palpitations and chest pains strongly suggest a causal relationship, though whether this increased risk is limited to patients with pre-existing CVD is unclear from the available evidence. There appears to be sufficient evidence of a relationship between NRT and the incidence of heart palpitations and chest pains, but these outcomes are not considered SAHEs. For SAHEs, however, the data can generally be regarded as limited evidence suggesting a lack of effect of NRT, though for the more rarely seen conditions inadequate evidence seems a more appropriate assessment.

Discussion

For many endpoints considered, the evidence presented is clearly inadequate to reach a reliable conclusion on whether NRT has an effect. This includes cancer, stroke, the serious health effects in patients considered in Table 18 and some of the less commonly considered reproduction/development endpoints. For some more commonly studied reproduction/development endpoints (including fetal loss, spontaneous abortion, birthweight, prematurity, neonatal intensive care admissions, overall incidence of congenital abnormalities and ADHD) and also for CVD, the evidence available suggests a lack of effect of NRT. Only for two endpoints is there any apparent evidence of harm. One, only providing limited evidence, given the association may result from confounding by higher morbidities in women prescribed NRT, is congenital abnormalities of the respiratory system where an increased incidence associated with NRT was seen in one study (Dhalwani et al. 2015). The other is heart palpitations and chest pains where a significant doubling of risk was reported in two recent meta-analyses of evidence from randomized controlled studies (Mills et al. 2010; Stead et al. 2012), which we regard as sufficient evidence of an effect, though it is not a SAHE.

Limitations

Numerous limitations affect the available evidence on possible adverse health risks from NRT use. First consider the evidence from the clinical trials. These typically compare a defined group of smokers allocated to receive NRT or a placebo and are designed to investigate effects of allocation on cessation rates. For any given outcome, there are three possible results: a significant increase in risk of a SAHE, a significant decrease in risk, or no significant change.

Interpreting a significant increase in risk seems relatively straightforward. Unless due to chance, to confounding (unlikely to be relevant in an RCT) or to an effect of smoking withdrawal in smokers who quit as a result of using NRT, it provides direct evidence of an effect of NRT.

The finding of no significant difference in risk can have various interpretations. One obvious possibility is the study was too small, particularly likely for relatively rare outcomes, since clinical studies are generally powered to detect effects on cessation, not on health outcomes. A second possibility is that actual usage of NRT was only for a very limited period. A third is that the follow-up period may have been too short. Especially for chronic diseases, risks may, for some time, depend predominantly on past exposure, before allocation to NRT.
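The first possibility, that a study was too small, can be made concrete with an approximate power calculation. The sketch below uses the standard normal approximation for a two-sided comparison of two proportions. The event rates and sample sizes are HYPOTHETICAL illustrations (a control-group risk of 1 % and a true doubling to 2 %), not values taken from any trial reviewed here.

```python
import math
from statistics import NormalDist

def power_two_proportions(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the unpooled normal approximation and ignores the negligible
    probability of rejecting in the wrong direction.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    se = math.sqrt(p_control * (1 - p_control) / n_per_arm
                   + p_treated * (1 - p_treated) / n_per_arm)
    return nd.cdf(abs(p_treated - p_control) / se - z_alpha)

# HYPOTHETICAL rare outcome: 1 % in controls, doubled to 2 % with treatment
small_trial = power_two_proportions(0.01, 0.02, n_per_arm=200)
large_trial = power_two_proportions(0.01, 0.02, n_per_arm=5000)
print(f"power with 200 per arm: {small_trial:.2f}; with 5000 per arm: {large_trial:.2f}")
```

Under these assumptions, a trial with a few hundred subjects per arm would usually miss even a true doubling of a rare outcome, whereas several thousand per arm would usually detect it.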

The finding of a significant decrease in risk of a SAHE in the NRT group can also have various interpretations, and its interpretation is hindered by the lack of corresponding results for never smokers. Thus, the endpoint may be related to components of smoke other than nicotine, with the observed reduction due to greater quitting in the NRT group. Alternatively, the endpoint may be increased by nicotine exposure, with the reduced risk due to the reduced dose of nicotine, either as a result of increased quitting in the NRT group, or of the NRT delivering less nicotine than cigarettes.

There are also other issues which limit interpretation. For instance, some studies are of a general cessation intervention including NRT, and not just of NRT, so risk differences may result from other aspects of the intervention. Also, some studies involve a group prescribed or recommended NRT, with many subjects possibly ignoring the advice and never using NRT. Another issue is that the analyses typically compare the groups initially allocated (as “intention-to-treat” analyses), not distinguishing risks in those who quit, cut down, or continue to smoke as before. Nor do they distinguish events in periods where the subjects were still using NRT and in those where they had quit. Also, there is very little information on risk by type of NRT used, or on dose–response, with results not related to the prescribed NRT dose. Finally, there is inconsistent reporting of adverse events, with differing classifications in different studies.

Turning now to epidemiological studies, a major issue is confounding by other variables, which affect risk of an SAHE and differ between the NRT and non-NRT groups. However, epidemiological studies can provide results for never smokers, or for those who both smoke and use NRT, and a few studies do adjust for changes in smoking post-NRT. If NRT users have no significant excess risk compared to non-smokers, then this suggests no effect of NRT, though issues of power and confounding remain. If, alternatively, similar increases are seen in NRT users and smokers, then this may mean risks are nicotine-related, with the dose of nicotine being unchanged. However, it may also mean the study was too small or too short term to detect differences in risk of an outcome resulting from chronic exposure. An increase in risk in the NRT group, but less than in smokers, could be because quitting reduces the dose of nicotine, or of other tobacco components. Clearly, the overall evidence is limited, with some findings difficult or impossible to interpret reliably in terms of an effect (or lack of effect) of NRT.

Comparison with Reproductive/Developmental Reviews

It is interesting to compare our findings with those of some of the reviews we identified. The first, published in 2008, was mainly concerned with effects of smoking (Rore et al. 2008) and included little on possible health effects of NRT, though it mentioned one study (Wisborg et al. 2000) as providing evidence “that nicotine as present in NRT products does not have an adverse effect on infant weight”, and another (Kapur et al. 2001) as suggesting that rapid fetal movements seen in a single pregnant woman may have been due to nicotine withdrawal. Another review, concerning the role of pharmacotherapy for smoking cessation, was published in 2009 (Oncken and Kranzler 2009). Again, no clear conclusions were drawn, most attention being given to NRT use possibly leading to an increase in birthweight. They considered the higher congenital malformation rates compared to non-smokers in the Danish Birth Cohort Study (Morales-Suarez-Varela et al. 2006) “should be interpreted with caution” and also noted the lack of relationship with stillbirth (Strandberg-Larsen et al. 2008).

In 2010, two reviews were published. One, which concerned long-term consequences of fetal and neonatal nicotine exposure (Bruin et al. 2010), evaluated evidence from six studies we considered (see CAFs 2, 3, 8, 10, 13 and 14), but reached no overall conclusions, except that “the safety of NRT use during pregnancy has been evaluated in a limited number of short-term human trials, but there is currently no information on the long-term effects of developmental nicotine exposure in humans”. The other, “nicotine replacement therapy effect on pregnancy outcomes” (Forinash et al. 2010), considered five of the studies (CAFs 2, 10, 12, 13, 14), the authors concluding that “if behaviour modification therapy is attempted without success, NRT should be offered because of decreased risk of low birthweight and preterm delivery compared to continued smoking. Additionally, NRT does not appear to increase the risk of malformations”.

Two further reviews, in 2011 and 2012, were both by the same group. The first (Coleman et al. 2011) concerned efficacy and safety of NRT for smoking cessation during pregnancy, while the second (Coleman et al. 2012a) was a Cochrane review of pharmacological interventions for promoting smoking cessation during pregnancy. The first review (Coleman et al. 2011) considered five RCTs, one (Hotham et al. 2006) giving no relevant safety results, the other four considered in CAFs 10, 11, 13 and 14. The authors concluded “there is currently insufficient evidence to determine whether or not nicotine replacement therapy is effective or safe when used in pregnancy for smoking cessation; further research and, in particular, placebo-randomized controlled trials are required”. The Cochrane review was also restricted to RCTs, considering one additional trial (CAF 16). The conclusion was essentially the same.

Comparison with Cardiovascular Disease Reviews

The only relevant review for CVD (Mills et al. 2014) concerned cardiovascular events associated with smoking cessation pharmacotherapies and was based on RCTs of NRT, bupropion and varenicline. The authors concluded “there was an elevated risk associated with nicotine replacement therapy that was driven predominantly by less serious events (RR 2.29; 95 % CI 1.39–3.82). When we examined major adverse cardiovascular events, we found … no clear evidence of harm with … nicotine replacement therapy (RR 1.95; 95 % CI 0.26–4.30)”, though the relevant table in the review gives a CI of 0.92–4.30. These estimates were derived from a “network meta-analysis”, which also incorporates results from other comparisons, such as bupropion (or varenicline) versus placebo, or versus NRT. A more conventional meta-analysis based simply on the 21 RCTs of NRT versus placebo gives a lower estimate, 1.38 (0.58–3.26), with a similar value, 1.48 (0.42–5.19), based on the three RCTs in high-risk patients.
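The discrepancy between the quoted lower limit (0.26) and the tabulated one (0.92) can be examined with a simple consistency check. The sketch below is illustrative only and is not part of the review's methods. It assumes that each interval is roughly symmetric around the point estimate on the log scale, which holds for conventional Wald-type intervals for ratio measures but may be only approximate for intervals from a network meta-analysis. The function name `log_scale_midpoint` and the labels are hypothetical; the numbers are those quoted in the text.

```python
import math

def log_scale_midpoint(lower, upper):
    """Geometric mean of the CI limits, i.e. the midpoint on the log scale."""
    return math.sqrt(lower * upper)

# (label, reported RR, lower limit, upper limit), all values as quoted in the text.
# Assumption: intervals are approximately symmetric on the log scale.
reported = [
    ("less serious events, network meta-analysis", 2.29, 1.39, 3.82),
    ("major events, CI as quoted in the text", 1.95, 0.26, 4.30),
    ("major events, CI as given in the table", 1.95, 0.92, 4.30),
    ("21 RCTs of NRT versus placebo", 1.38, 0.58, 3.26),
    ("three RCTs in high-risk patients", 1.48, 0.42, 5.19),
]

for label, rr, lower, upper in reported:
    mid = log_scale_midpoint(lower, upper)
    print(f"{label}: reported RR {rr:.2f}, log-scale midpoint {mid:.2f}")
```

Under this assumption, the interval 0.92–4.30 has a midpoint close to the reported RR of 1.95, whereas 0.26–4.30 does not, which would be consistent with the tabulated lower limit being the intended one.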

Relevance to nicotine-based tobacco products

We suggest that the conclusions from this systematic review on the potential SAHEs of pharmaceutical NRT use may help to predict health effects of nicotine exposure from nicotine-based tobacco products. Like NRT products, nicotine-based tobacco products contain nicotine as an active ingredient, do not contain tobacco and deliver nicotine via several different routes (oral and inhalation at present). Unlike NRT products, nicotine-based tobacco products are regulated as tobacco products and not licensed as drugs or medicines. Examples include products such as ENDS (e-cigarettes). Though ENDS products are relatively new, about 4 % of adults in the USA (US) and in the UK currently use e-cigarettes every day or some days (Pearson et al. 2012; Royal College of Physicians 2016; Schoenborn and Gindi 2015). Unfortunately, we know little about the potential SAHEs with use of these new products.

Even though these nicotine-based products are reportedly used for different purposes (smoking cessation or recreation), there appear to be many similarities in nicotine delivery between NRT products and nicotine-based tobacco products. First, the source of added nicotine in all of these products is the same: pharmaceutical-grade nicotine derived from tobacco (Flora et al. 2015). Second, the route of exposure and the average maximum plasma nicotine concentration (Cmax) in the blood of NRT users (4 mg nicotine gum or inhaler) and ENDS users are similar, with Cmax typically ranging from 5 to 40 ng nicotine/ml plasma (D’Ruiz et al. 2015; Haussmann and Fariss 2016; Schneider et al. 2001; St. Helen et al. 2015; Vansickel and Eissenberg 2013). Thus, the potential serious adverse health effects of nicotine per se presumably depend on two parameters, the inherent toxicity of the active ingredient and the level of exposure, and both appear to be very similar for NRT and nicotine-based tobacco products. Unfortunately, for commercially available ENDS products, the enormous diversity in product design and aerosol delivery makes generalization about potential adverse health effects challenging. Recent regulatory guidance in the US, UK and Europe, however, should standardize this category, resulting in well-characterized commercial ENDS products in the near future (Royal College of Physicians 2016; US Food and Drug Administration 2016).

Our review does not consider evidence on the potential adverse health effects related to long-term use of smokeless tobacco (SLT) including snus as a surrogate for NRT use. Firstly, these products differ in that a SLT user is not only exposed to nicotine extracted from this tobacco product, but also exposed to extracted compounds that may stimulate or mask a potential toxic effect of nicotine (Gong et al. 2014; Hecht et al. 1986; Hoffman et al. 1987; Prokopczyk et al. 1987). Secondly, extensive evidence on the potential adverse health effects of snus use has been thoroughly reviewed in various publications. For example, epidemiological evidence (Lee 2011, 2013) clearly demonstrates that snus use is not associated with an increased risk of cancer, heart disease or stroke, and provides scant support for any major adverse health effect. These findings are consistent with nicotine having very little effect on risk of SAHEs.

Comparison with authoritative body conclusions

Overall conclusions

The only SAHE from NRT exposure we identified is an increased incidence of respiratory congenital abnormalities reported in one study. For many of the SAHEs considered, inadequate scientific evidence was available to determine reliably whether NRT has an effect. Thus, except for the observed developmental effect, we consider the scientific evidence, to date, does not support an association between NRT exposure and SAHEs (Table 20).

Table 20 Strength of evidence for serious adverse health effects of NRT

The conclusions of our review agree with statements published by several authoritative bodies including the Royal College of Physicians, NICE, the FDA and the US Surgeon General (National Institute for Health and Care Excellence 2013; Royal College of Physicians 2016; US Food and Drug Administration 2013; US Surgeon General 2014). For example, the FDA (US Food and Drug Administration 2013) stated that they “examined the use of NRT products over periods larger than 12 weeks” and “have not identified any safety risks associated with such use”. In addition, a NICE report (Jones et al. 2012) concluded that “no authors have attributed serious adverse events to NRT when used as part of smoking harm reduction”.

Cancer

We consider the strength of scientific evidence inadequate to associate NRT exposure with cancer (Table 20). Only one well-conducted study was identified, which found no evidence for an effect of NRT exposure on lung cancer, GI cancer or overall cancer (Murray et al. 2009). While a study clearly demonstrating a lack of adverse effect (cancer) associated with NRT exposure would normally lead to a conclusion of “limited evidence for a lack of serious adverse effect”, our assessment of inadequate evidence took into account the study’s limited follow-up time and the difficulty of reliably separating effects of NRT from those associated with a reduction in smoking.

Similar conclusions were reached in recent reports from the US Surgeon General (2014), which stated that “there are insufficient data to conclude that nicotine causes or contributes to cancer in humans”, and from the National Institute for Health and Care Excellence (2013), which stated that “the results of this multicentre randomized controlled trial (Murray et al. 2009) suggest that long-term use of NRT is not associated with an increased incidence of harm, including cardiovascular events or cancer, with the latest analysis of outcome at 12.5 years from study outset”.

Reproductive/developmental effects

Based on one epidemiology study that demonstrated an increased incidence in respiratory congenital abnormalities, we consider there is limited evidence of a serious reproductive/developmental effect (lung development) with NRT exposure (Table 20). It is interesting to note that numerous animal studies have demonstrated an association between prenatal nicotine exposure and lung development abnormalities and offspring respiratory disease (Rehan et al. 2012; Spindel and McEvoy 2016).

In addition to congenital abnormalities (birth defects), nine further endpoints were examined in numerous epidemiology and clinical studies. These showed no evidence of a significantly poorer outcome in NRT users. For some commonly studied conditions (fetal loss and spontaneous abortion, birthweight, gestational age, preterm birth, neonatal intensive care admissions and overall incidence of birth defects), the strength of evidence indicates limited evidence of a lack of effect from NRT exposure (Table 20). In fact, some evidence suggests that infants born to pregnant smokers allocated to NRT use have increased birthweight and gestational age at birth, and better development at 2 years. For the other endpoints considered (head circumference/length of infant, neonatal death and Apgar score), an assessment of inadequate evidence was reached.

The FDA classifies NRT products as Category “C” or “D” developmental hazards depending on the type of NRT product. For example, nicotine gum is given a Pregnancy Category “C” classification, indicating that animal studies demonstrating an adverse effect may or may not have been conducted and that adequate and well-controlled studies in pregnant women are lacking. In contrast, nicotine patches and inhalers are classified as Pregnancy Category “D”, indicating that there are adequate well-controlled studies in pregnant women demonstrating a risk to the fetus (Dempsey and Benowitz 2001; Meadows 2001). These FDA warnings suggest a potential difference in risk between different types of NRT products. Our critical examination of the literature does not support such a difference. In fact, most published studies do not report the type of NRT product used. In the few studies clearly distinguishing types of NRT product, no difference was seen for reproductive/developmental endpoints such as birthweight and stillbirths, according to whether nicotine gum, patch or inhaler was used (Lassen et al. 2010; Strandberg-Larsen et al. 2008).

A recent report (US Surgeon General 2014) states “the evidence is sufficient to infer that nicotine adversely affects maternal and fetal health during pregnancy, contributing to multiple adverse outcomes such as preterm delivery and still birth”. However, this conclusion was inferred from studies of smokeless tobacco users, not from use of NRT products or nicotine per se delivery systems. Unlike NRT, the use of smokeless tobacco products results in exposure to tobacco extracts that contain, in addition to nicotine, numerous potential cytotoxic and cytoprotective compounds (Hoffman et al. 1987). Such a complex mixture complicates the interpretation of the study results with regard to the role of nicotine per se in the observed adverse effects.

Cardiovascular disease

We consider there is limited evidence for a lack of serious cardiovascular effects associated with NRT use (Table 20). The cardiovascular effects examined in numerous epidemiological and clinical studies include the effect of NRT exposure on AMI, deaths caused by CVD and hospital admissions or readmissions for various cardiac-related issues. In general, no significant overall effect was observed, though many studies had limitations such as few relevant events, short follow-up and failure to control for changes in smoking behaviour subsequent to NRT use. The type of NRT product used did not appear to have a significant influence on serious adverse cardiovascular effects observed (Hubbard et al. 2005).

As evidence for a lack of increased incidence of or effect on cardiovascular disease, NICE (National Institute for Health and Care Excellence 2013) cite six studies evaluating the safety of NRT products in patients with cardiac disease and five randomized trials evaluating biomarkers of potential cardiac disease in the plasma of patients treated with NRT for smoking cessation.

Other serious adverse health effects

Health-related warning statements on NRT product labels give authoritative bodies the opportunity to communicate their concern about the potential harmful effects of these products. The labels warn about potential risks to the child (reproductive/developmental), the heart (cardiac disease), the vascular system (high blood pressure), the digestive system (stomach ulcers) and diabetes.

Based on these warnings and the findings from non-clinical studies, we expanded our literature search to include terms associated not only with CVD, reproductive and developmental effects and cancer but also with diabetes, stroke, GI tract, stomach, pancreas, pancreatic and kidney disease. These other SAHEs following NRT exposure were investigated in patients and healthy population studies. Our strength of evidence assessment concluded there was inadequate evidence to associate NRT exposure with stroke or these other SAHEs (Table 20).

Conclusions

We critically evaluated 34 epidemiological studies and clinical trials relating NRT exposure to cancer, reproductive/developmental effects, CVD, stroke and/or other SAHEs in patients and in healthy populations. The overall evidence suffers from many limitations, the most significant being the short duration of exposure to NRT products (≤12 weeks) and of follow-up in most of the studies, and the common failure to account for changes in smoking behaviour following NRT use. The only SAHE from NRT exposure we identified was an increased incidence of respiratory congenital abnormalities reported in one study. Other findings include limited evidence for a lack of any SAHE of NRT use on CVD and a range of reproductive/developmental endpoints. For cancer, stroke and other SAHEs, the evidence was judged inadequate. Though data are limited, there is no evidence that observed associations (or lack of associations) vary by type of NRT product (gum, patch, inhaler) used. Our overall conclusions from this systematic review agree with recent statements from authoritative bodies including the FDA, the US Surgeon General, the Royal College of Physicians and NICE.