Cancer Causes & Control, Volume 19, Issue 3, pp 317–328

Analysis of lung cancer incidence in the nurses’ health and the health professionals’ follow-up studies using a multistage carcinogenesis model

Authors

  • Rafael Meza
    • Division of Public Health Sciences, Fred Hutchinson Cancer Research Center
  • William D. Hazelton
    • Division of Public Health Sciences, Fred Hutchinson Cancer Research Center
  • Graham A. Colditz
    • Channing Laboratory, Brigham and Women’s Hospital and Harvard Medical School
    • Siteman Cancer Center, Washington University Medical School
    • Division of Public Health Sciences, Fred Hutchinson Cancer Research Center
Original Paper

DOI: 10.1007/s10552-007-9094-5

Cite this article as:
Meza, R., Hazelton, W.D., Colditz, G.A. et al. Cancer Causes Control (2008) 19: 317. doi:10.1007/s10552-007-9094-5

Abstract

We analyzed lung cancer incidence among non-smokers, continuing smokers, and ex-smokers in the Nurses Health Study (NHS) and the Health Professionals Follow-Up Study (HPFS) using the two-stage clonal expansion (TSCE) model. Age-specific lung cancer incidence rates among non-smokers are identical in the two cohorts. Within the framework of the model, the main effect of cigarette smoke is on the promotion of partially altered cells on the pathway to cancer. Smoking-related promotion is somewhat higher among women, whereas smoking-related malignant conversion is somewhat lower. In both cohorts the relative risk for a given daily level of smoking is strongly modified by duration. Among smokers, the incidence in NHS relative to that in HPFS depends both on smoking intensity and duration. The age-adjusted risk is somewhat larger in NHS, but not significantly so. After smokers quit, the risk decreases over a period of many years and the temporal pattern of the decline is similar to that reported in other recent studies. Among ex-smokers, the incidence in NHS relative to that in HPFS depends both on previous levels of smoking and on time since quitting. The age-adjusted risk among ex-smokers is somewhat higher in NHS, possibly due to differences in the age-distribution between the two cohorts.

Keywords

Lung cancer epidemiology; Lung cancer age-specific incidence; Never smokers lung cancer risk; Smokers relative risk; Ex-smokers relative risk; Multistage carcinogenesis; Two-stage clonal expansion model

Introduction

The Nurses Health Study (NHS) and the Health Professionals Follow-Up Study (HPFS) constitute outstanding datasets with which to investigate in detail the relationship between smoking and lung cancer, and to evaluate the influence of gender on both background and smoking-induced risks. We analyze the consequences of smoking and smoking cessation on the lung cancer incidence rates in the NHS and HPFS using multistage carcinogenesis models. This approach allows us to explicitly consider the entire smoking histories of individuals in these cohorts, including complex time-related factors, such as ages at start and quit, and changes in smoking habits.

We use likelihood-based methods to estimate the parameters of the two-stage clonal expansion (TSCE) model. Using the model with the estimated parameters, we construct age-specific incidence curves for non-smokers, and for smokers and ex-smokers with pre-specified histories of smoking. We also investigate the roles of daily smoking intensity and smoking duration in lung cancer risk. In particular, for a given level of smoking, we examine the impact of duration of smoking on the relative risk (RR).

The question of whether, for a given level of smoking, females are at greater risk than males of developing lung cancer has generated a great deal of debate [16]. The NHS consists entirely of females and the HPFS consists entirely of males. Our methods allow us to analyze the NHS and HPFS simultaneously using a common model, so we can evaluate, within a single framework, similarities and differences in the lung cancer risk in females and males.

Finally, we use our fitted models to project lung cancer risks for various smoking scenarios.

Methods

The nurses health and the health professionals follow-up studies

The Nurses Health Study (NHS) was established in 1976 by Dr. Frank Speizer. The cohort consists of 121,700 nurses aged 30–55 at the beginning of follow-up. Every two years the nurses return mailed questionnaires about diseases, smoking status, hormone use, and diet, among many other health-related issues. The Health Professionals Follow-up Study (HPFS) was established in 1986 by Dr. Walter Willett. The cohort consists of 51,529 men in the health professions aged 40–75 at the beginning of follow-up. As in the NHS, the health professionals receive questionnaires every two years about diseases and health-related topics such as smoking, physical activity, and medications taken. In addition, the participants respond to questionnaires about their diet every four years. Fewer than 10% of subjects in the NHS and 7% in the HPFS have been lost to follow-up. We exclude from our analysis individuals who do not have complete smoking information, non-Caucasians, and those with a prior history of cancer (other than non-melanoma skin cancer). Table 1 describes the subpopulations of both cohorts that we use in our analysis.
Table 1

Classification according to status at baseline (NHS-1976, HPFS-1986)

                         Total      Never      Former     Smokers
NHS
    Subjects             104,493    51,121     24,474     28,898
    Lung cancer cases    1,165      130        134        901
    Avg. follow up       23.15 years
HPFS
    Subjects             46,050     22,431     19,632     3,987
    Lung cancer cases    461        58         247        156
    Avg. follow up       14.93 years


Although a decade separates the start of the two studies, the age and birth-year distributions of both cohorts are quite similar.

Smoking histories

Smoking histories, which we denote d(t), are piecewise constant functions representing the number of cigarettes per day smoked by each subject at any particular age, t. We construct the smoking histories as follows. Clearly d(0) = 0, and we keep this value constant until the age at start of smoking. Subjects in both cohorts report their smoking intensities at different time points in the following categories: 0–4 cig/day, 5–14 cig/day, 15–24 cig/day, 25–34 cig/day, 35–44 cig/day, >45 cig/day. We assign the midpoint of each category as the corresponding intensity for that category, or 50 if the intensity is >45 cig/day. We then compute the ages at which subjects changed their smoking intensities (the change points of d(t)) and assign the corresponding intensity value to d(t).

We use the smoking information reported in the initial questionnaire, in which participants answered questions about their past smoking habits, to calculate the change points up to the age at entry into the study. The information provided at entry differs between the cohorts. In the NHS, the nurses reported the age at which they started smoking (if they smoked), their average smoking rate, and, for ex-smokers, the quitting age. Thus, for the NHS subjects, there are at most two ages of interest before the age at entry, namely the age at start of smoking and the age at quitting. In contrast, the subjects in the HPFS reported their average smoking rate during specific age-periods before their age at entry (<15, 15–19, 20–29, 30–39, 40–49, 50–59, >60 years). In this case, we consider only the age-periods whose right end point is lower than the age at entry to the study. In addition, we use the information in the >60 age-interval only if the age at entry to the HPFS is at least 70 years. We assume that the age at start of HPFS smokers is the midpoint of the first age-period with a positive smoking rate. If the first positive exposure occurs in the <15 or >60 age-interval, we use 13 or 65 years as the starting age, respectively. Finally, if there is a change in the smoking dose or status between two consecutive age-periods, we assume that the change occurred in the midpoint year between them and assign the corresponding age as a change point of the smoking history.

The final step is to calculate the change points after the age of entry to the studies. This is straightforward, since the subjects in both cohorts report their smoking status and dose every two years after their entry into the study. We compare the smoking intensities between consecutive questionnaires and, if they differ, we assume that the change occurred at the beginning of the midpoint year and assign the corresponding age as a change point of the smoking history. Typical smoking histories for members of the NHS and HPFS are shown in Fig. 1.
https://static-content.springer.com/image/art%3A10.1007%2Fs10552-007-9094-5/MediaObjects/10552_2007_9094_Fig1_HTML.gif
Fig. 1

Example of smoking histories for members of the NHS and HPFS. Top panel, subject of the NHS. Bottom panel, subject of the HPFS
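The construction of d(t) described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the analysis: the names `CATEGORY_INTENSITY`, `followup_change_points`, and `make_history` are hypothetical, the category midpoints are computed here as simple averages of the category bounds, and a change between two questionnaires is placed at the midpoint age rather than at the beginning of the midpoint calendar year.

```python
from bisect import bisect_right

# Assumed values of d(t) for each reported category (cig/day): the midpoint
# of the category bounds, and 50 for the open-ended top category.
CATEGORY_INTENSITY = {
    "0-4": 2.0,
    "5-14": 9.5,
    "15-24": 19.5,
    "25-34": 29.5,
    "35-44": 39.5,
    ">45": 50.0,
}


def followup_change_points(ages, intensities):
    """Change points implied by consecutive follow-up questionnaires.

    ages[i] is the age at questionnaire i, intensities[i] the cig/day value
    assigned to that report. When two consecutive reports differ, the change
    is placed at the midpoint age between them (a simplification).
    """
    points = []
    for i in range(1, len(ages)):
        if intensities[i] != intensities[i - 1]:
            points.append(((ages[i - 1] + ages[i]) / 2.0, intensities[i]))
    return points


def make_history(change_points):
    """Return d(t), a step function equal to 0 before the first change point."""
    pts = sorted(change_points)
    change_ages = [age for age, _ in pts]
    values = [value for _, value in pts]

    def d(t):
        i = bisect_right(change_ages, t)
        return 0.0 if i == 0 else values[i - 1]

    return d


# Hypothetical subject: starts at age 18 in the 15-24 category, enters the
# study at 40, cuts down between ages 42 and 44, and quits between 44 and 46.
start = [(18.0, CATEGORY_INTENSITY["15-24"])]
follow = followup_change_points([40, 42, 44, 46], [19.5, 19.5, 9.5, 0.0])
d = make_history(start + follow)
```

With these inputs, `d` is 0 before age 18, 19.5 until age 43, 9.5 until age 45, and 0 afterwards.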

We use likelihood-based methods to estimate parameters of the TSCE model, which are functions of the smoking history. A brief description of the model and details of the likelihood construction are presented below.

The two-stage clonal expansion model

For our analyses, we use a multistage model that acknowledges three phases in the process of carcinogenesis. In the first phase (initiation) a susceptible stem cell acquires one or more mutations resulting in an initiated cell, which has partially escaped growth control. In the second phase (promotion) initiated cells undergo clonal expansion, either spontaneously or in response to endogenous or exogenous promoters. Promotion is an extremely efficient way to bring about malignant conversion because clonal expansion of initiated cells creates a large population of cells that have acquired some of the genetic changes required for malignant transformation. Finally, in the third phase (malignant conversion) one of the initiated cells acquires further mutational changes leading to a malignant cell. The simplest model incorporating these three phases is the TSCE model [7, 8]. A schematic representation of the model is shown in the Appendix.

The TSCE model assumes that normal stem cells become initiated according to a Poisson process with intensity μ0X, where X is the number of susceptible stem cells. Initiated cells expand clonally (promotion) via a linear birth and death process with rates (α,β). This means that each time that an initiated cell divides, it can produce two initiated cells (with birth rate α) or die/differentiate (with death rate β). Initiated cells can also divide into one initiated and one malignant cell (with rate μ1). The time between the first malignant cell and diagnosis is modeled either as a constant or gamma-distributed lag.
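For intuition, the following Python sketch evaluates the survival and hazard functions of the TSCE model when all parameters are constant over age (for example, for a life-long non-smoker). The closed form used here is the one commonly quoted in the TSCE literature; it is an assumption of this sketch and is not reproduced from this article, which relies on the more general expressions for piecewise constant parameters [11]. The function names are hypothetical, `nu` stands for the initiation intensity μ0X, and `g` for the net promotion rate α − β − μ1. The lag time between the first malignant cell and diagnosis is ignored.

```python
import math


def tsce_roots(alpha, g, mu1):
    """Roots p < 0 < q satisfying p + q = -g and p * q = -alpha * mu1."""
    disc = math.sqrt(g * g + 4.0 * alpha * mu1)
    return 0.5 * (-g - disc), 0.5 * (-g + disc)


def tsce_survival(t, nu, alpha, g, mu1):
    """Probability that no malignant cell has appeared by age t."""
    p, q = tsce_roots(alpha, g, mu1)
    denom = q * math.exp(-p * t) - p * math.exp(-q * t)
    return ((q - p) / denom) ** (nu / alpha)


def tsce_hazard(t, nu, alpha, g, mu1):
    """Hazard h(t) = -d/dt log S(t) for the constant-parameter model."""
    p, q = tsce_roots(alpha, g, mu1)
    num = p * q * (math.exp(-q * t) - math.exp(-p * t))
    denom = q * math.exp(-p * t) - p * math.exp(-q * t)
    return (nu / alpha) * num / denom


# Illustrative call with background-style values (X = 1e7, mu0 = mu1).
nu, alpha, g, mu1 = 8.14e-8 * 1e7, 3.0, 0.0956, 8.14e-8
h60 = tsce_hazard(60.0, nu, alpha, g, mu1)
```

Two features of this form are worth noting: for small t the hazard grows approximately as `nu * mu1 * t`, and for large t it levels off at `-nu * p / alpha`, which is the leveling off of incidence at old ages that multistage models of this kind predict.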

Each of the parameters of the TSCE model may, in principle, be affected by cigarette smoke. Recall that d(t) denotes the cigarette consumption of an individual at age t. We assume that each of the identifiable parameters of the model (see Table 2) has a dose–response given by
$$ \theta_{tob}(d(t))=\theta(1+\theta_cd(t)^{\theta_e}) $$
(1)
where θ is the background parameter, and θc and θe are the dose–response coefficients. Previous analyses using the TSCE model [9, 10] of the relationship between smoking and the lung cancer rates suggested that power laws are good models for the smoking dose–response [9]. We estimate the background rates and the cigarette dose–response coefficients for each identifiable parameter in the model.
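As a small worked example of Eq. (1), the Python sketch below applies the power-law dose–response to the promotion rate. The function name `dose_response` is hypothetical, and the numerical values are the promotion estimates listed in Table 2; the sketch shows only the functional form, not the fitting procedure.

```python
def dose_response(theta, theta_c, theta_e, d):
    """Eq. (1): theta_tob(d) = theta * (1 + theta_c * d**theta_e)."""
    if d < 0:
        raise ValueError("cigarettes per day must be non-negative")
    return theta * (1.0 + theta_c * d ** theta_e)


# Promotion rate g for a smoker of 20 cigarettes per day, using Table 2 values.
g_background = 0.0956
g_nhs_20 = dose_response(g_background, 0.1458, 0.5171, 20.0)
g_hpfs_20 = dose_response(g_background, 0.1123, 0.5171, 20.0)

# With an exponent below 1 the response is sub-linear: doubling the dose
# multiplies the excess over background by 2**0.5171, not by 2.
excess_ratio = (dose_response(g_background, 0.1458, 0.5171, 40.0) - g_background) / (
    g_nhs_20 - g_background
)
```

At d = 0 the function returns the background value θ, so non-smoking periods in a smoking history fall back to the background rate.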
Table 2

Parameter estimates [MLE (MCMC 95% CI)]. A single value in a row spans both cohort columns.

Parameter                                            NHS                        HPFS

Fixed parameters
    Stem cell population X                           10^7
    Initiated cells’ division rate α                 3
    Gamma-distributed lag time mean                  5

Background rates
    Initiation & malignant-conversion rate μ0 = μ1   8.14e-8 (5.51e-8, 1.27e-7)
    Initiated cells’ promotion rate g = α − β − μ1   0.0956 (0.0772, 0.1106)
    Gamma-distributed lag time std                   3.28*

Tobacco coefficients
    Tobacco promotion rate coefficient† g_c          0.1458 (0.1010, 0.1752)    0.1123 (0.0802, 0.1500)
    Tobacco promotion rate power† g_e                0.5171 (0.4703, 0.5945)
    Tobacco malignant-conversion coefficient μ1c     0.2095 (0.1565, 0.6691)    0.5339 (0.2876, 1.6972)
    Tobacco malignant-conversion power μ1e           0.4684 (0.1083, 0.5483)

Loglik                                               11696.40

*95% CI not calculated. Fixed at MLE value during MCMC simulation

† Applies also to the initiated cells’ division rate

Likelihood function

The likelihood function is the product of individual likelihoods over all the subjects in the cohort(s). Each participant was lung cancer free at the beginning of the study. Hence, to calculate the likelihood, we must condition on the fact that the individuals did not have the clinical disease at their age of entry to the study (\(ae_i\)). Subjects are censored in case of death by any other cause or if they survive without a lung cancer diagnosis until the end of follow-up (year 2000 for our analysis). In addition, we also censor any individuals who were diagnosed with other types of cancer, except non-melanoma skin cancer. Let \(al_i\) be the censoring or failure (lung cancer diagnosis) age. The individual likelihoods are
$$ L_i(al_i,ae_i;\bar{\theta}(d_i))= \left\{\begin{array}{ll} -\frac{S^\prime(al_i;\bar{\theta}(d_i))}{S(ae_i;\bar{\theta}(d_i))} & \hbox{for lung cancer cases,}\\ \frac{S(al_i;\bar{\theta}(d_i))}{S(ae_i;\bar{\theta}(d_i))} &\hbox{otherwise,}\\ \end{array}\right. $$
(2)
where \(S(t;\bar{\theta}(d_i))\) is the survival probability at age t of an individual with smoking history \(d_i\), and \(\bar{\theta}(d_i)\) denotes the vector of identifiable model parameters given the smoking history \(d_i\) (note: the prime denotes the derivative with respect to t). The overall likelihood is then
$$ {\mathcal{L}}=\prod_{i}{L_i(al_i,ae_i;\bar{\theta}(d_i))}, $$
(3)
where the product is taken over all the subjects in the cohort(s).
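Equations (2) and (3) translate directly into a log-likelihood, as the Python sketch below illustrates. It uses the identity −S′(t) = h(t)S(t), so a lung cancer case contributes log h(al) + log S(al) − log S(ae) and a censored subject contributes log S(al) − log S(ae). The function names are hypothetical, and the constant-hazard survival function in the example is a toy stand-in for the TSCE survival of a subject with a given smoking history.

```python
import math


def individual_log_likelihood(survival, hazard, age_entry, age_last, is_case):
    """log L_i from Eq. (2), conditioning on being disease free at entry."""
    ll = math.log(survival(age_last)) - math.log(survival(age_entry))
    if is_case:
        # -S'(t) = h(t) * S(t), so cases add the log hazard at diagnosis.
        ll += math.log(hazard(age_last))
    return ll


def cohort_log_likelihood(subjects):
    """Log of Eq. (3): sum of individual log-likelihoods.

    Each subject is a tuple (survival, hazard, age_entry, age_last, is_case);
    in the model the two callables would depend on the subject's history d_i.
    """
    return sum(individual_log_likelihood(*s) for s in subjects)


# Toy example with a constant hazard of 0.01 per year (not a TSCE fit).
rate = 0.01
toy_survival = lambda t: math.exp(-rate * t)
toy_hazard = lambda t: rate
subjects = [
    (toy_survival, toy_hazard, 40.0, 60.0, False),  # censored at age 60
    (toy_survival, toy_hazard, 50.0, 55.0, True),   # diagnosed at age 55
]
total = cohort_log_likelihood(subjects)
```

Working on the log scale avoids numerical underflow when the product in Eq. (3) runs over more than a hundred thousand subjects.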

The survival function

Exact expressions for the survival and hazard functions of the TSCE with piecewise constant parameters are available in the literature [11]. If we assume a constant or gamma lag time between the appearance of the first malignant cell and clinical diagnosis, the survival function required in expression (3) is given by
$$ S(t;\bar{\theta}(d_i))= \left\{\begin{array}{ll} S_2(t-t_{lag};\bar{\theta}(d_i)) & \hbox{if lag time is constant,}\\ 1-\int_0^{t}(1-S_2(u;\bar{\theta}(d_i)))f(t-u)du & \hbox{if lag time is gamma-distributed,} \end{array}\right. $$
(4)
where \(S_2(t;\bar{\theta}(d_i))\) represents the TSCE model survival and f(·) is the gamma density.
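The gamma-lag branch of Eq. (4) is a one-dimensional integral that can be evaluated numerically. The Python sketch below uses composite Simpson's rule purely for simplicity; this is a swapped-in technique, since the analysis itself is described as using Gauss–Legendre quadrature. The function names are hypothetical, the gamma density is parameterized by its mean and standard deviation (shape = (mean/std)², scale = std²/mean), and the density at zero lag is taken as 0, which assumes a shape parameter above 1 (true for a mean of 5 and a standard deviation of 3.28).

```python
import math


def gamma_pdf(x, mean, std):
    """Gamma density parameterized by mean and standard deviation."""
    shape = (mean / std) ** 2
    scale = std ** 2 / mean
    if x <= 0.0:
        return 0.0  # correct at x = 0 only when shape > 1 (assumed here)
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def lagged_survival(t, s2, mean, std, n=400):
    """Gamma-lag branch of Eq. (4): 1 - int_0^t (1 - S2(u)) f(t - u) du.

    s2 is the survival function without lag; n is an even number of panels
    for composite Simpson's rule.
    """
    if t <= 0.0:
        return 1.0
    h = t / n
    total = 0.0
    for i in range(n + 1):
        u = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * (1.0 - s2(u)) * gamma_pdf(t - u, mean, std)
    return 1.0 - total * h / 3.0


# Toy pre-lag survival (constant hazard 0.01 per year), lag mean 5, std 3.28.
s2 = lambda u: math.exp(-0.01 * u)
s70 = lagged_survival(70.0, s2, 5.0, 3.28)
```

Because the lag delays diagnosis, the lagged survival at a given age is never below the pre-lag survival at the same age.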

Ten-year risk predictions

Models optimized for subjects in the NHS and HPFS cohorts are used to predict 10-year risk estimates for lung cancer incidence. Competing causes of mortality are adjusted for using standard actuarial methods for multiple decrement life tables. All-cause annual risk estimates are extracted from the National Center for Health Statistics [12]. We use the 1989–1991 life tables for both cohorts. Approximate 95% confidence intervals (CI) are calculated by sampling model variables from a Markov Chain Monte Carlo (MCMC) simulation using the Metropolis–Hastings algorithm.
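The idea behind a competing-risk adjustment can be illustrated with a simplified discrete-time calculation in Python. This is a sketch under stated assumptions and not the actuarial procedure used for the projections: the function name `ten_year_risk` is hypothetical, `lung_hazard` stands for a model hazard evaluated at mid-year, `annual_death_prob` stands for a life-table probability of dying from other causes during the year, and competition within a year is ignored.

```python
import math


def ten_year_risk(age, lung_hazard, annual_death_prob, years=10):
    """Probability of a lung cancer diagnosis within `years` of `age`.

    Each year, the chance of diagnosis is weighted by the probability of
    still being alive and cancer free at the start of that year.
    """
    alive_and_free = 1.0
    risk = 0.0
    for k in range(years):
        a = age + k
        p_lung = 1.0 - math.exp(-lung_hazard(a + 0.5))  # mid-year hazard
        risk += alive_and_free * p_lung
        alive_and_free *= (1.0 - p_lung) * (1.0 - annual_death_prob(a))
    return risk


# Toy inputs: constant hazard of 0.002 per year, 2% annual other-cause mortality.
risk_no_competition = ten_year_risk(60, lambda a: 0.002, lambda a: 0.0)
risk_with_competition = ten_year_risk(60, lambda a: 0.002, lambda a: 0.02)
```

With no competing mortality and a constant hazard λ, the sum reduces to 1 − exp(−10λ); adding competing mortality lowers the projected risk because some subjects die before a diagnosis could occur.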

Ratio of age-adjusted hazards

For any particular smoking history, we use the ratio of age-adjusted hazards as a measure of the lung cancer relative risk between the NHS and HPFS. This ratio is calculated as follows. We compute the TSCE model age-specific incidence (ages 40–80) in each cohort using the corresponding maximum likelihood estimate (MLE) parameters and the specific smoking history of interest. We then adjust for age in each cohort using the 1990 US total white population and compute the ratio of age-adjusted hazards. To calculate a 95% CI of the estimated ratio, we obtain independent samples of the model parameters (model described in Joint Model section) via Markov Chain Monte Carlo (MCMC) simulations with the Metropolis–Hastings algorithm [13]. For each set of parameters in the MCMC run, we compute the TSCE hazard in the NHS and HPFS (using the specific smoking history of interest), adjust for age in each cohort, and compute the ratio between females and males. We then calculate the 95% CI of the ratio of age-adjusted hazards.
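The calculation just described amounts to direct age standardization followed by a percentile interval over MCMC draws. The Python sketch below shows both steps with hypothetical names and toy inputs; the standard-population weights are placeholders, not the 1990 US white population, and the nearest-rank percentile rule is an assumption of this sketch.

```python
import math


def age_adjusted_rate(hazard, weights):
    """Weighted average of an age-specific hazard.

    weights maps age -> standard-population count (or share) at that age.
    """
    total = sum(weights.values())
    return sum(w * hazard(age) for age, w in weights.items()) / total


def adjusted_hazard_ratio(hazard_f, hazard_m, weights):
    """Ratio of age-adjusted hazards, females relative to males."""
    return age_adjusted_rate(hazard_f, weights) / age_adjusted_rate(hazard_m, weights)


def percentile_interval(samples, level=0.95):
    """Nearest-rank percentile interval, e.g. over ratios from MCMC draws."""
    xs = sorted(samples)
    n = len(xs)
    lo = xs[int(math.floor((1.0 - level) / 2.0 * (n - 1)))]
    hi = xs[int(math.ceil((1.0 + level) / 2.0 * (n - 1)))]
    return lo, hi


# Toy example: female hazard twice the male hazard at every age.
weights = {40: 3.0, 50: 2.0, 60: 1.0}   # placeholder standard population
ratio = adjusted_hazard_ratio(lambda a: 2e-4 * a, lambda a: 1e-4 * a, weights)
interval = percentile_interval(list(range(101)))
```

In the procedure described above, `percentile_interval` would be applied to the list of ratios obtained by recomputing `adjusted_hazard_ratio` for every parameter set in the MCMC run.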

Estimation procedure

Estimation of the parameters is done via maximum likelihood methods. The background rates and the dose–response relationships are estimated by maximizing the likelihood for the observed cancer incidence using the piecewise constant cigarette exposures of each individual. The likelihood function calculation and its maximization are done by High Performance Fortran routines. The Nelder–Mead simplex and the modified Davidon–Fletcher–Powell algorithms are used for the optimization. Gauss–Legendre quadrature is used for the integration required to compute the survival function when the gamma-distributed lag time (time from malignant transformation to diagnosis) is used.

We used two estimation procedures. In the first, we fit the background parameters to the never-smokers only and then, keeping them constant, fit the model to the entire cohort to optimize the dose–response parameters. In the second, we fit the model to the entire cohort and estimate all the parameters simultaneously. We find that both approaches lead to similar fits in terms of the likelihood function. However, the first provides better fits to the number of cancer cases in each sub-group, so it is preferred to the latter.¹ All the results presented here are based on the first estimation procedure.

Results

Not all the parameters in the TSCE model are identifiable [11]. We use the specific parameterization shown in Table 2. To start, we assume that all the TSCE model parameters can be affected by cigarette exposure. Using likelihood-ratio tests, we reduce the model to describe the cohorts’ lung cancer incidence with as few parameters as possible. We find that in both cohorts, only the net cell proliferation and the malignant conversion rate have a statistically significant dose–response.² In addition, we find that using a gamma-distributed lag time improves the model fit significantly in both cohorts.³ Table 2 shows the reduced set of parameters. The corresponding 95% CIs are constructed via MCMC simulations with the Metropolis–Hastings algorithm [13]. The TSCE model describes lung cancer incidence in both cohorts well, as can be seen from Fig. 2.
https://static-content.springer.com/image/art%3A10.1007%2Fs10552-007-9094-5/MediaObjects/10552_2007_9094_Fig2_HTML.gif
Fig. 2

NHS and HPFS lung cancer incidence. Solid line maximum likelihood estimate from joint fit of the NHS and HPFS, dashed lines, MCMC 95% CI. The incidence is calculated by summing individual one-year integrated-hazards over all subjects at risk. Stars show the ratio of observed lung cancer cases to person years at risk in five-year bins, with 95% confidence bars based on Poisson assumptions. Please note the different scales in the panels
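The MCMC intervals reported in Table 2 and drawn as dashed lines in Fig. 2 come from a Metropolis–Hastings sampler. As a rough illustration of how such a sampler works, the Python sketch below implements a random-walk Metropolis step with Gaussian proposals. It is a generic textbook version with hypothetical names; the proposal scheme, priors, and tuning used for the actual analysis are not described in the text, so everything specific here is an assumption. In an application of this kind, `log_density` would be the model log-likelihood (plus any log-prior) as a function of the parameter vector.

```python
import math
import random


def metropolis(log_density, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler with independent Gaussian proposals.

    start: initial parameter vector (log_density must be finite there).
    step:  proposal standard deviation for each parameter.
    """
    rng = random.Random(seed)
    x = list(start)
    lp = log_density(x)
    chain = []
    for _ in range(n_samples):
        proposal = [xi + rng.gauss(0.0, s) for xi, s in zip(x, step)]
        lp_new = log_density(proposal)
        # Symmetric proposal: accept with probability min(1, exp(lp_new - lp)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
        chain.append(list(x))
    return chain


# Toy target: a standard normal log-density in one dimension.
chain = metropolis(lambda v: -0.5 * v[0] ** 2, [0.0], [1.0], 20000, seed=1)
draws = [c[0] for c in chain]
mean = sum(draws) / len(draws)
sd = math.sqrt(sum((v - mean) ** 2 for v in draws) / len(draws))
```

Interval estimates for any derived quantity (a parameter, a hazard, a ratio) are then read off as percentiles of that quantity evaluated over the chain, usually after discarding an initial burn-in portion.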

Independent models

First, we fit our models to both cohorts independently. In both cohorts we find that the primary etiological mechanism for lung cancer appears to be smoking-related promotion (increased clonal expansion rate). The fitted models have a highly significant sub-linear dose–response on the promotion of premalignant lesions. These results are in agreement with a previous joint analysis of the lung cancer mortality in the British doctors’ and the American Cancer Society CPS-I and CPS-II cohorts [10]. Interestingly, the results are closer to the fits to the CPS-II cohort, which was roughly contemporaneous with NHS and HPFS, than to fits to the earlier CPS-I and British doctors’ cohorts. The NHS, HPFS and CPS-II cohorts had a stronger dose–response of tobacco on promotion than the earlier cohorts, but a reduced effect on initiation. These differences may in part be explained by changes in cigarette composition, with higher levels of nitrosamines in the newer cigarettes acting as promoters, while the lower tar levels may be associated with the lower apparent initiation rate. We also find a significant dose–response in the malignant conversion of premalignant lesions in the NHS and HPFS. This was not seen in CPS-II, possibly because the data did not include follow-up for changes in smoking intensity. A dose–response on malignant conversion has relatively short-term effects on incidence rates.

Interestingly, all parameter estimates are similar in the NHS and HPFS cohorts, suggesting that a common model could describe the incidence in both.

Joint model

There are reports in the literature suggesting that, for a given level of smoking, women are at higher risk of lung cancer than men [1–3, 5, 6]. However, a recent analysis of the NHS and HPFS by Bain et al. [4] found no statistically significant gender differences in the lung cancer rates among smokers for a given level of smoking. In a later correction to the original publication, Bain et al. [14] reported a gender difference among ex-smokers, with the risk in women being 1.5 relative to men. Wakelee et al. [15] suggested in a recent analysis of several large cohort studies, including the NHS and HPFS, that the lung cancer incidence among never smokers is higher in women. However, although their estimated age-adjusted lung cancer incidence among never smokers is slightly higher in the NHS than in the HPFS, they do not reject the equality of the never smoker lung cancer rates in the two cohorts.

To address the issue of gender differences, we explored a joint model in the two cohorts. Multistage models allow us to test for specific gender differences in the initiation, promotion and malignant conversion rates of lung cancer. Using likelihood-ratio tests, we cannot reject the equality of the background parameters between females and males, although we can reject the equality of all the model parameters. In particular, a model with different tobacco-induced promotion and malignant conversion coefficients between women and men is the overall preferred model. Table 2 shows the parameter estimates of the preferred joint model. All the figures in this article are obtained using the parameter estimates of the preferred joint model.
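The nested-model comparisons above follow the usual likelihood-ratio recipe: twice the difference in maximized log-likelihoods is compared with a chi-square distribution whose degrees of freedom equal the number of parameters freed. The Python sketch below shows the arithmetic using only the standard library. The function names are hypothetical, and the log-likelihood values in the example are made up for illustration; they are not results from this analysis.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(z)), 0.5   # df = 1
    else:
        q, a = math.exp(-z), 1.0              # df = 2
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(loglik_full, loglik_reduced, extra_params):
    """Return the LR statistic and its chi-square p-value."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, extra_params)


# Hypothetical example: freeing 2 parameters improves the log-likelihood by 4.1.
stat, p_value = likelihood_ratio_test(-11692.3, -11696.4, 2)
```

A small p-value means the restricted model (for example, identical tobacco coefficients in both cohorts) fits significantly worse than the model in which those parameters are allowed to differ.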

The NHS and HPFS cohorts contain information on never, current, and former smokers. Figure 2 shows the lung cancer incidence among never, former, and current smokers in both cohorts and the model predictions. The bottom panels in Fig. 2 show the number of lung cancer cases in the NHS and HPFS as a function of years since quitting.

Discussion

Methods of analysis based on ideas of multistage carcinogenesis are fully parametric and allow complex patterns of exposure to multiple covariates to be explicitly considered [9]. In the analyses reported in this article, we have explicitly considered individual smoking histories, including age at start of smoking, changes in levels of smoking, and age at quitting among ex-smokers. Models incorporating detailed smoking information on the individual level are useful in exploring the consequences of intervention strategies to modify smoking habits. Moreover, being biologically based, multistage models allow the investigation of the effects of smoking on lung cancer initiation, promotion and malignant conversion. Hence, multistage models provide a natural framework to evaluate the potential benefits of chemo-prevention and pharmacological intervention strategies based on the mode of action of the intervention. Finally, analyses based on multistage models begin with a completely different set of assumptions and therefore complement the traditional approaches. In particular, these analyses do not assume proportionality of hazards, a very strong assumption that appears to be inappropriate in the case of lung cancer and smoking [16].

Previous analyses using multistage models

The TSCE model has been used to describe the lung cancer incidence and mortality in several cohort and case–control studies [9, 10, 17–19]. In all of them, smoking-related promotion has been found to be the primary etiological mechanism of lung carcinogenesis. Interestingly, analyses of older datasets have also shown an effect of smoking on lung cancer initiation and no effect on malignant conversion [9, 10]. However, exactly the opposite has been found in more recent datasets [18, 19]. In particular, Heidenreich et al. [18] found in a case–control study in Germany that smoking has significant effects on promotion and malignant conversion and no effects on initiation. More recently, Schollnberger et al. [19] found similar patterns in a large cohort study carried out in 10 European countries. Interestingly, Schollnberger et al. reported that a common model described lung cancer incidence in males and females in the European Prospective Investigation into Cancer and Nutrition (EPIC). They concluded that gender differences in lung cancer risk are due entirely to differences in smoking habits. Hazelton et al. [10] also found a limited effect of tobacco on lung cancer initiation in the CPS-I study; however, no effect on malignant conversion was seen in that cohort. These differences may in part be explained by changes in cigarette composition, with higher levels of nitrosamines in the newer cigarettes acting as promoters, while the lower tar levels may be associated with the lower apparent initiation rate. Additionally, the smoking information available in the older cohorts may not have been detailed enough to detect an effect on malignant conversion.

Incidence among life-long non-smokers

Our analyses indicate that the incidence of lung cancer among life-long non-smokers is virtually identical in the two cohorts. The incidence curves predicted by our model along with observed incidence rates in both cohorts are shown in the top panels of Fig. 2 and in the left panel of Fig. 3a.
https://static-content.springer.com/image/art%3A10.1007%2Fs10552-007-9094-5/MediaObjects/10552_2007_9094_Fig3_HTML.gif
Fig. 3

Age-specific lung cancer incidence rates and relative risk of smoking and quitting. Maximum likelihood lung cancer incidence from joint model among 20 and 40 cigarette smokers in the NHS and HPFS. Smoking starts at age 20. Former smokers quit at ages 30 or 50. (a) Predicted lung cancer incidence from joint model for never and current smokers in the NHS and HPFS. Please note the different scales in the panels. (b) Relative risk among smokers (current/never smoker) and among ex-smokers (former/current smoker) for 20 and 40 cigarettes

Incidence among continuing smokers

Figure 3a shows the age-specific incidence curves generated by the joint model for female and male smokers of 20 and 40 cigarettes per day. The second panels of Fig. 2 show the age-specific incidence rates among continuing smokers in both cohorts along with the incidence curves generated by our model.

The relative hazard associated with smoking 20 and 40 cigarettes per day in each cohort is shown in the top panels of Fig. 3b. It is clear from this figure that the relative risks associated with smoking are strongly modified by duration of smoking. That this observation is not an artifact of our model can be seen from the directly computed rate ratios in the Cancer Prevention Study I (Burns et al. [20], Table 11), which show a similar concave-down picture not only for lung cancer but also for other causes of mortality associated with cigarette smoking. The initial increase in RR with duration of smoking can be directly attributed to the strong influence of tobacco on promotion. The later decline can be attributed to the strong increase in non-smoker incidence rates of lung cancer with age with a concomitant leveling off of the incidence rates among smokers predicted by the model. The strong modification of RR by duration of smoking suggests that the proportional hazards model may not be the appropriate tool for analyses of these data.

A common model for lung cancer incidence in the NHS and HPFS (identical model parameters) is rejected by the likelihood-ratio test (see Joint Model section). The best fitting model indicates that smoking-induced promotion is somewhat higher among females, whereas smoking-induced malignant conversion is somewhat lower. As a result of these opposing effects on smoking-induced lung cancer risk, the incidence curves are rather similar, as shown in the second panels of Fig. 2 and in the right panel of Fig. 3a. The evidence of a larger effect of smoking on promotion among females is consistent with a synergistic effect with estrogens [2, 6], and with effects of gastrin-releasing peptide (GRP) expression in females [21]. GRP stimulates cell proliferation in tumors [22] and appears to be expressed more frequently in female than in male non-smokers and activated earlier in women in response to tobacco exposure than in men [21]. The hazard among females relative to that among males for smokers of 20 and 40 cigarettes per day is shown in Fig. 4. This figure shows that the relative risk increases gradually with duration of smoking, but the confidence bands generated by MCMC methods include 1. For smokers of 20 cigarettes per day, the ratio of age-adjusted female to male rates is 1.1 (95% CI = 0.77–1.29)⁴ and is not statistically significant, a finding that is consistent with that reported in Bain et al. (2004). For smokers of 40 cigarettes per day, the ratio of age-adjusted female to male rates is 1.2 (95% CI = 0.80–1.64).
https://static-content.springer.com/image/art%3A10.1007%2Fs10552-007-9094-5/MediaObjects/10552_2007_9094_Fig4_HTML.gif
Fig. 4

Women/Men hazard ratio for current and former smokers. Solid line Maximum likelihood hazard ratio from independent fits to the NHS and HPFS. Dashed lines, MCMC 95% CI. Left panels. Women/Men hazard ratio for smokers. Right panels. Women/Men hazard ratio for ex-smokers (quit at age 50). Smoking in all panels starts at age 20

Incidence among ex-smokers

The bottom panels of Fig. 2 show the incidence rate among ex-smokers as a function of time since quitting. The model predictions describe the data well in both cohorts except for the first few years after quitting. We attribute this discrepancy in the first few years to quitters who stopped smoking because they had developed symptoms of lung cancer. This phenomenon is well known [23, 24]. The effect of smoking on the rate of malignant conversion implies a rather quick decrease in risk after quitting, and the effect on the rate of promotion implies a continuing decrease in risk over a prolonged period of time as seen in previous analyses of mortality data (Hazelton et al. [10]). Bottom panels of Fig. 3b show the decrease in lung cancer incidence among ex-smokers relative to that among continuing smokers. The pattern of decrease in both cohorts is consistent with that reported for mortality by Hazelton et al. [10], by Peto et al. [25] and by Rachet et al. [16].

Figure 4 shows the female to male hazard ratio for ex-smokers. This ratio is higher than the ratio of hazards for continuing smokers (left panels of the figure). The hazard ratio quickly increases to about 1.5 and remains approximately constant. It is important to mention that these calculations also depend on the assumed age at start (age 20) and age at quitting (age 50). The confidence bounds on the ratio indicate that it is of borderline significance, consistent with the report by Bain et al. [14]. For ex-smokers of 20 cigarettes per day, the ratio of age-adjusted female to male rates is 1.35 (95% CI = 0.99–1.56). For ex-smokers of 40 cigarettes per day, the ratio of age-adjusted female to male rates is 1.48 (95% CI = 1.08–1.90).

The estimated benefits of smoking cessation depend largely on the available information at older ages, where longer durations of both abstinence and smoking are observed. The age-distribution of individuals differs between the two cohorts, with a larger proportion of older individuals present in the HPFS. Therefore, it is plausible that the lower risk among the ex-smokers in the HPFS predicted by the model is attributable, at least in part, to the difference in age-distribution.

Ten-year risk predictions

The 10-year risk predictions with 95% CIs are shown in Table 3 for different smoking patterns among continuing smokers and for former smokers who quit at the beginning of the 10-year risk-projection period. These calculations may overestimate the 10-year risk of lung cancer incidence for heavy smokers, because population-based annual life tables [12] were used to adjust for competing risks; no life tables for different smoking levels were available. The risk for smokers who quit at the beginning of the 10-year interval was calculated by assuming that the dose–response functions return to background levels when smoking stops. These estimates show the benefit of quitting for any dose and duration of smoking. Risk estimates are somewhat higher in the NHS than in the HPFS. They are consistent with 10-year risk projections based on data from the Carotene Retinol Efficacy Trial (Table 2 of Bach et al. [26]). The estimates in Table 3 are higher than 10-year projections of lung cancer mortality risk based on the CPS-I and CPS-II cohorts (Table 3 in Hazelton et al. [10]).
Table 3

The 10-year risk projections for smokers who smoke for 25, 40, or 50 years and continue to smoke or quit at ages 55, 65, or 75 years based on models for White male and female smokers in the NHS and HPFS cohorts [% risk (95% CI)]

| Cohort and smoking level | Age (years) | 25 years: Quit | 25 years: Still smoking | 40 years: Quit | 40 years: Still smoking | 50 years: Quit | 50 years: Still smoking |
|---|---|---|---|---|---|---|---|
| NHS: 20-cig smokers | 55 | 0.8 (0.6–1.2) | 1.7 (0.9–3.1) | 2.0 (1.6–2.7) | 3.8 (2.3–6.5) | * | * |
| NHS: 20-cig smokers | 65 | 2.0 (1.5–2.7) | 3.8 (2.0–6.6) | 4.6 (3.7–5.7) | 7.9 (4.8–12.0) | 6.7 (5.6–8.0) | 10.7 (7.1–15.1) |
| NHS: 20-cig smokers | 75 | 4.0 (2.9–5.2) | 6.8 (3.7–10.6) | 7.8 (6.3–9.5) | 12.0 (7.7–16.7) | 10.3 (8.5–12.6) | 14.9 (10.2–20.1) |
| NHS: 40-cig smokers | 55 | 1.8 (1.3–2.7) | 4.2 (2.0–7.6) | 5.4 (4.0–7.0) | 10.4 (6.2–15.7) | * | * |
| NHS: 40-cig smokers | 65 | 4.1 (2.9–5.7) | 8.4 (4.5–13.8) | 10.0 (8.3–12.15) | 16.9 (11.3–23.0) | 12.3 (9.6–15.7) | 18.8 (13.2–25.6) |
| NHS: 40-cig smokers | 75 | 7.3 (5.4–9.4) | 12.7 (7.4–18.7) | 13.7 (11.2–16.9) | 20.4 (14.4–27.3) | 14.9 (10.6–20.7) | 20.8 (14.7–29.1) |
| HPFS: 20-cig smokers | 55 | 0.7 (5.2–1.1) | 1.8 (0.8–3.1) | 1.5 (1.0–2.0) | 3.4 (1.6–5.4) | * | * |
| HPFS: 20-cig smokers | 65 | 1.7 (1.2–2.3) | 3.8 (1.8–5.9) | 3.2 (2.4–4.1) | 6.6 (3.6–9.4) | 4.3 (3.5–5.3) | 8.3 (5.3–11.2) |
| HPFS: 20-cig smokers | 75 | 3.1 (2.3–4.2) | 5.9 (3.2–8.6) | 5.2 (4.2–6.4) | 9.0 (6.1–12.2) | 6.5 (5.4–7.8) | 10.7 (7.9–14.1) |
| HPFS: 40-cig smokers | 55 | 1.4 (0.8–2.2) | 3.8 (1.4–6.4) | 3.4 (2.1–4.7) | 7.9 (3.5–11.5) | * | * |
| HPFS: 40-cig smokers | 65 | 3.1 (1.9–4.2) | 7.1 (3.2–10.5) | 6.3 (4.5–7.8) | 12.4 (7.2–16.8) | 7.8 (6.1–9.6) | 13.9 (10.0–18.3) |
| HPFS: 40-cig smokers | 75 | 5.1 (3.5–6.6) | 9.6 (5.3–13.6) | 8.5 (6.9–10.5) | 14.2 (10.3–18.7) | 9.5 (6.4–12.2) | 14.7 (10.3–19.6) |

Note: Life tables are used to adjust for death from competing causes. Model-based 10-year risks are shown for each smoking pattern, with MCMC 95% CIs. Projections for individuals who quit smoking and continue to abstain for the following 10 years assume that the model variables revert to background values following smoking cessation. Asterisks mark cells that correspond to unrealistically early ages for starting smoking. These rates are not generalizable and are probably lower than expected for the general population, because the cohort members are better educated and healthier.
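The competing-risk adjustment described for Table 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `ten_year_risk`, the piecewise-constant annual hazards, and all numerical values are hypothetical. In the paper, the lung cancer hazard comes from the fitted TSCE model and the other-cause mortality from population life tables [12]; neither is reproduced here.

```python
import math


def ten_year_risk(cancer_hazards, other_mortality):
    """Cumulative lung cancer risk over consecutive one-year intervals,
    allowing for death from competing causes.

    Both inputs are annual hazards (per person-year), assumed constant
    within each year. For each year, the probability of leaving the
    at-risk state is 1 - exp(-(h + m)), and a fraction h / (h + m) of
    those exits is attributed to lung cancer.
    """
    if len(cancer_hazards) != len(other_mortality):
        raise ValueError("both hazard sequences must cover the same years")
    at_risk = 1.0  # probability of being alive and cancer-free
    risk = 0.0
    for h, m in zip(cancer_hazards, other_mortality):
        total = h + m
        if total > 0:
            risk += at_risk * (1.0 - math.exp(-total)) * (h / total)
        at_risk *= math.exp(-total)
    return risk


# Hypothetical annual hazards over a 10-year window (NOT study values).
cancer_hazard = [0.005] * 10      # lung cancer incidence hazard
other_mortality = [0.02] * 10     # life-table mortality from other causes

risk_adjusted = ten_year_risk(cancer_hazard, other_mortality)
risk_unadjusted = ten_year_risk(cancer_hazard, [0.0] * 10)
print(f"10-year risk, adjusted for competing causes: {100 * risk_adjusted:.2f}%")
print(f"10-year risk, no competing causes:           {100 * risk_unadjusted:.2f}%")
```

The sketch also shows why general-population life tables may overestimate risk for heavy smokers: if the true other-cause mortality among heavy smokers is higher than the life-table value, fewer of them remain at risk, and the adjusted 10-year risk would be lower.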

Conclusions

We conclude that the risk of lung cancer is similar among non-smoking and smoking men in the HPFS and women in the NHS, but that the lung cancer risk among ex-smokers is higher in the NHS. Within the framework of the TSCE model, this difference can be attributed to higher smoking-related promotion in the NHS cohort. However, it is plausible that this is just an artifact produced by the difference in age-distribution between the two cohorts. In both cohorts, we find that the main effect of cigarette smoke is on the promotion of premalignant lesions. This is consistent with previous analyses of several cohort and case–control studies using the TSCE model [9, 10, 17–19]. The relative risk of smoking depends strongly on the duration of smoking. For a smoker who begins to smoke before the age of 20, the RR increases until about age 70 and declines thereafter. This pattern is consistent with that observed in other studies [10, 20]. Among ex-smokers, the relative risk of former versus current smokers appears to decrease more strongly at higher smoking levels. This finding is consistent with the analysis of the CPS-I, CPS-II and British Doctors cohorts in Hazelton et al. [10] and with the analysis of a large case–control study in Rachet et al. [16].

Footnotes
1. We also tested disjoint models for never and ever smokers. These models led to similar fits, but with a larger number of parameters.

2. However, a smoking effect on initiation almost doubling the background rate is still consistent with the data.

3. A gain of eight log-likelihood points with only one more parameter.

4. Age-adjusted to the 1990 US total population. Please see Materials and Methods for details.

Acknowledgments

We thank the Cancer Intervention and Surveillance Modeling Network (CISNET) Group, Dr. Anup Dewanji and Dr. Jihyoun Jeon for useful suggestions. We acknowledge financial support from the NIH grants RO1 CA047658 and UO1 CA97415.

Supplementary material

10552_2007_9094_MOESM1_ESM.pdf (9 kb)
Two-stage clonal expansion model (10 KB)

Copyright information

© Springer Science+Business Media B.V. 2007