Background

Smoking is a major risk factor for many diseases [1], including highly prevalent conditions such as heart disease [1–3], diabetes and its complications [1, 3–14], and hypertension [15, 16]. The Standards of Medical Care in Diabetes note that hypertension is a common comorbidity of diabetes and state: "In these patients, other cardiovascular risk factors including obesity, hyperlipidemia, smoking ... should be carefully assessed and treated." [17]. These standards of care support smoking cessation counseling as a necessary component of comprehensive approaches to manage patients with diabetes because smoking is related to macrovascular and microvascular complications of diabetes [17]. However, neither health care providers nor individuals with diabetes are sufficiently aware of the increased risk of developing cardiovascular disease for smokers with diabetes [18, 19]. Preventive care to reduce the risk of future cardiac events includes interventions targeting type 2 diabetes, obesity and insulin resistance such as the use of medications and lifestyle changes to stop smoking, increase exercise, and incorporate diet modification to lower blood glucose levels, blood pressure, and cholesterol [17, 20–24]. Health care providers' compliance with diabetes-related preventive care has been low [25].

While smoking status is routinely measured through self-report, the validity of such data may be suspect because individuals may misreport their smoking status. Carbon monoxide and nicotine/cotinine are widely used biomarkers of tobacco or tobacco smoke exposure that provide objective measures of smoking [26–32]. Nearly all studies reporting smoking status collect self-reported data, and these data are very rarely validated. For example, of the 191 original articles, clinical practice articles, clinical implications of basic research articles, and health policy reports with "smoking" in the text published in the New England Journal of Medicine from January 2001 through December 2005, only 1.0% (n = 2) included cotinine-determined smoking measures.

Self-reported smoking status underestimates smoking prevalence in certain populations [33–35]. A meta-analysis on the validity of self-reported smoking status found that, on average, 13% of "true" smokers gave an invalid report of non-smoking, with wide variation ranging from 0% to 94% [36]. This wide variation may be related to differences in study design, including distinct objective measures of smoking such as carbon monoxide, carboxyhemoglobin, thiocyanate, and serum or salivary cotinine, and to very different study populations.

One important question is whether the proportion of respondents who smoke but report they are not smoking represents a systematic (i.e., differential) misclassification bias that confounds the relationships between predictors and dependent variables. Smoking misclassification bias has been found with a correlation as low as 0.40 between self-reported and cotinine-determined smoking [31]. Invalid reporting of smoking may explain the effect of a main exposure (e.g., beta-carotene), indicating the need to include biomarkers of smoking in epidemiologic studies [31]. In other words, if smoking is measured with error, and smoking is associated with an exposure of interest, it will be difficult to rule out smoking as the true cause of the association between the exposure of interest and the outcome, leading to a biased estimate.

Given the evidence that significant proportions of smokers self-report as non-smokers in population-based studies, it may be expected that the more social pressure against smoking exists in a society or in the smoker's personally relevant social network, the greater the smoker's propensity to deny smoking. It is hypothesized that people who have or have had severe smoking-related diseases such as diabetes or myocardial infarction have a strong propensity to deny smoking. Thus, sociodemographic characteristics such as age, race/ethnicity, gender, education and income, along with smoking-related diseases in which patients are advised not to smoke, may be relevant to invalid self-reports.

The current literature supports a recommendation that self-reported smoking status be confirmed by biochemical verification [26–32, 37]. Descriptive studies found that many patients deny smoking [38], even after having been informed that they would be tested for tobacco smoke exposure [39]. A research approach referred to as the "bogus pipeline" strategy informs study participants that their self-reports can or will be verified by a monitoring device [40]. As applied to self-reported tobacco use, participants are informed that their self-reports can or will be verified by a biochemical test; specimens are then collected but not analyzed. This approach may not be relevant to the third National Health and Nutrition Examination Survey (NHANES III) participants because, although they were aware that their serum would be tested for many components, there was no focus on the cotinine assay.

Typically, studies reporting on the validity of self-reported smoking for different groups have used a single covariate, such as race [41, 42], age [43, 44], gender [44], social class [44], marital status [44], education [44], or race-gender categories [45]. While the accuracy of self-reported smoking has been previously reported, this question has not been approached through a population-based study simultaneously assessing age, race/ethnicity-gender, education, income, and smoking-related diseases as predictors of invalid self-reported non-smoking by "true" smokers. Previous studies focused on self-reported smoking as the denominator, not on "true" smokers. That is, previous studies investigated the question: among those who report they smoke or do not smoke, what proportion truly smoke or do not smoke? There has not been a focus on the question: among those who truly smoke, what proportion report they do not smoke?
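The two questions contrasted above differ only in their denominators. The following is a minimal sketch of that distinction, using made-up counts; the `CrossTab` type, its field names, and every number in it are hypothetical illustrations and are not NHANES III data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrossTab:
    """Hypothetical cross-classification of biomarker status by self-report."""
    smoker_says_smoker: int        # biomarker-positive, reports smoking
    smoker_says_nonsmoker: int     # biomarker-positive, reports non-smoking
    nonsmoker_says_smoker: int     # biomarker-negative, reports smoking
    nonsmoker_says_nonsmoker: int  # biomarker-negative, reports non-smoking


def denial_among_true_smokers(t: CrossTab) -> float:
    """Denominator = "true" smokers (the question this study focuses on)."""
    true_smokers = t.smoker_says_smoker + t.smoker_says_nonsmoker
    return t.smoker_says_nonsmoker / true_smokers


def true_smokers_among_reported_nonsmokers(t: CrossTab) -> float:
    """Denominator = self-reported non-smokers (the focus of earlier studies)."""
    reported_nonsmokers = t.smoker_says_nonsmoker + t.nonsmoker_says_nonsmoker
    return t.smoker_says_nonsmoker / reported_nonsmokers


# Made-up counts, chosen only to show that the two proportions can differ widely.
example = CrossTab(
    smoker_says_smoker=940,
    smoker_says_nonsmoker=60,
    nonsmoker_says_smoker=20,
    nonsmoker_says_nonsmoker=3980,
)

print(f"Denial among true smokers: {denial_among_true_smokers(example):.3f}")
print(f"True smokers among reported non-smokers: "
      f"{true_smokers_among_reported_nonsmokers(example):.3f}")
```

With these illustrative counts the same 60 discordant respondents are 6% of "true" smokers but under 2% of self-reported non-smokers, which is why the choice of denominator matters.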

The objectives of our study include: (1) Describe sociodemographic characteristics and smoking-related disease status for valid and invalid reporting of smoking status by "true" smokers; and (2) Assess predictors by quantifying the association between invalid self-reported non-smoking and age, race/ethnicity, gender, income, education, diabetes, myocardial infarction, and hypertension, among "true" smokers in NHANES III [46].

Methods

Study population

The third NHANES (1988–1994) is a national examination survey of civilian noninstitutionalized individuals, representative of the US population. NHANES employs a complex, multistage, stratified, clustered sample design. NHANES includes questionnaire, laboratory assay, and clinical examination measures of health outcomes and explanatory variables. The data relevant to this study are responses to questions regarding age, race/ethnicity, gender, household income, education, history of tobacco use, diabetes, myocardial infarction, and hypertension, and the laboratory assay of serum cotinine. This study was reviewed by the Case Western Reserve University institutional review board and approved as exempt under 45 Code of Federal Regulations part 46.101b, #4, IRB Protocol Number 20050805. In addition, all investigators complied with the Data Use Restrictions for the NHANES III public-use data set.

Participants provided informed consent to voluntarily participate in the interview, or the interview and examination, or the interview, examination and laboratory tests [46]. We identified 7295 adults, ages 45+ years, who completed the Household Adult Questionnaire indicating they did not currently use smokeless tobacco, pipe or cigars, completed the Mobile Examination Center (MEC) question "How many cigarettes have you smoked in the past 5 days?", and reported they did not use nicotine gum in the past 5 days. Among the 7295 participants with MEC smoking status, 410 (weighted 4.3%) did not have serum cotinine data, resulting in 6885 adults, ages 45+ years, who did not use smokeless tobacco, pipe, cigars, or nicotine gum and who had MEC smoking status and MEC serum cotinine data. Of these, we studied the 1483 (weighted 22.0%) cotinine-determined smokers 45+ years of age in the NHANES III data set who did not currently use smokeless tobacco, pipe, cigars, or nicotine gum. We restricted our analyses to 45+ year olds because studies investigating diabetes and myocardial infarction typically assess individuals who develop the condition at this age [5–15].

Definition of main outcome

The main outcome was invalid non-smoking status, defined as "true" smokers who self-reported as non-smokers during the MEC component of NHANES III. Cotinine, a metabolite of nicotine, is a biochemical measure of tobacco exposure during the previous 1 to 2 days [29]. Serum cotinine was assayed using an isotope dilution, liquid chromatography, tandem mass spectrometry method with a limit of detection of 0.03 ng/ml [46]. We used this gold standard to identify "true" smokers as those adults with serum cotinine levels of 15 ng/ml or higher [29, 47]. The definition of cotinine-determined smokers was based on the finding that serum cotinine level has a bimodal distribution, with a separation between the two peaks at a serum cotinine level of 10–15 ng/ml. This distinguishes active tobacco use from secondhand smoke exposure because the highest serum cotinine level in a nonsmoker exposed to secondhand smoke is 10–13 ng/ml [30]. Thus, we used the more conservative cutpoint of 15 ng/ml of serum cotinine to identify cotinine-determined smokers who self-reported they currently do not smoke. Henceforth, we will refer to cotinine-determined smokers as "true" smokers.
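The outcome definition above can be sketched as a small classification rule. This is an illustration only: the 15 ng/ml cutpoint comes from the text, but the function names are hypothetical, and treating a report of zero cigarettes in the past 5 days as "self-reported non-smoking" is an assumption about how the MEC question was coded.

```python
from typing import Optional

# Cutpoint stated in the text: serum cotinine of 15 ng/ml or higher.
COTININE_CUTPOINT_NG_ML = 15.0


def is_true_smoker(serum_cotinine_ng_ml: float) -> bool:
    """Cotinine-determined ("true") smoker under the 15 ng/ml cutpoint."""
    return serum_cotinine_ng_ml >= COTININE_CUTPOINT_NG_ML


def invalid_nonsmoking(serum_cotinine_ng_ml: float,
                       cigarettes_past_5_days: int) -> Optional[bool]:
    """Outcome among "true" smokers.

    Returns None for respondents below the cutpoint, because they are
    outside the study denominator. ASSUMPTION: zero cigarettes in the
    past 5 days is taken to mean self-reported non-smoking.
    """
    if not is_true_smoker(serum_cotinine_ng_ml):
        return None
    return cigarettes_past_5_days == 0


# Hypothetical respondents: (serum cotinine in ng/ml, cigarettes in past 5 days)
for cotinine, cigarettes in [(220.0, 60), (48.5, 0), (0.4, 0)]:
    print(cotinine, cigarettes, invalid_nonsmoking(cotinine, cigarettes))
```

Returning `None` rather than `False` for respondents below the cutpoint keeps them out of both the numerator and the denominator, matching the restriction of the analysis to "true" smokers.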

Definition of explanatory variables

To meet the objectives of our study we assessed the role that smoking-related diseases may play in the validity of self-reported smoking status by focusing on three highly prevalent smoking-related diseases. We also assessed sociodemographic characteristics that play a role in disease status and may play a role in the amount of social pressure against smoking in the individual's social network. Thus, the potential explanatory variables for invalid self-reported non-smoking status by "true" smokers were age, race/ethnicity, gender, household income, education, diabetes, myocardial infarction, and hypertension status. Age was dichotomized as 45–64 versus 65+ years of age using categories previously used for NHANES publications [48]. Race/ethnicity and gender were combined into six race/ethnicity-gender categories, namely Non-Hispanic White (NHW) females, Non-Hispanic Black (NHB) females, Mexican-American (MA) females, NHW males, NHB males, and MA males. The race/ethnicity category "other" was excluded due to small or zero cell counts. Household income and education were dichotomized using categories previously used in NHANES publications, such that household income was defined as less than $20,000 versus at least $20,000, and education was dichotomized as high school graduate (Yes/No) [49, 50].

To further elucidate potential explanatory variables, we assessed three highly prevalent smoking-related diseases in which patients are advised not to smoke, namely diabetes, myocardial infarction, and hypertension. While we hypothesized that people who suffer from these smoking-related diseases may have a strong propensity to deny smoking, we recognize the current clinical guidelines recommend that all patients who use tobacco be advised to quit [51], not just those who already have disease. For males, diabetes status was based on their response to whether they were ever told by a doctor that they had diabetes or sugar diabetes (Yes/No). Females were defined as having diabetes if they had been told by a doctor that they had diabetes or sugar diabetes when they were not pregnant. Women with diabetes only during pregnancy were defined as not having diabetes. Participants responding that a doctor ever told them that they had a heart attack were defined as having a history of myocardial infarction, and those responding that they were told by a doctor or other health professional that they had hypertension, on two or more different visits were defined as having hypertension.

Statistical analyses

We compared invalid self-reported non-smoking among age, race/ethnicity-gender categories, education, and household income for "true" smokers with and without diabetes, history of myocardial infarction, and hypertension. This study compared two distinct measures of smoking status, self-reported questionnaire data and laboratory assay results for serum cotinine. SAS-callable SUDAAN [52] was used for all analyses, to account for complex survey design and sample weights in NHANES III. To assess predictors of invalid self-reported non-smoking we quantified the association between invalid self-reported non-smoking and age, race/ethnicity-gender categories (reported to indicate interaction), income, education, diabetes, history of myocardial infarction, and hypertension by calculating the odds ratios (OR) and 95% confidence interval (CI) for both the unadjusted or crude OR (ORCrude), and the adjusted OR (ORAdj) using multiple logistic regression modeling, simultaneously adjusting for the potential explanatory variables.
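As a rough illustration of the crude odds ratio and its 95% confidence interval, the sketch below uses the standard log-odds (Woolf) interval on a simple 2x2 table. It is not the study's method: it ignores the NHANES III sample weights and complex design that SUDAAN accounts for, and the function name and counts are hypothetical.

```python
import math
from typing import Tuple


def crude_odds_ratio(a: int, b: int, c: int, d: int,
                     z: float = 1.96) -> Tuple[float, float, float]:
    """Unweighted odds ratio with a Woolf (log-based) confidence interval.

    a: comparison group, invalid non-smoking report
    b: comparison group, valid smoking report
    c: reference group, invalid non-smoking report
    d: reference group, valid smoking report
    Assumes simple random sampling and all four cells greater than zero.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts for illustration only (not taken from Table 1).
or_, lo, hi = crude_odds_ratio(a=20, b=80, c=10, d=90)
print(f"OR = {or_:.2f}; 95% CI: {lo:.2f}-{hi:.2f}")
```

A design-based analysis would generally give different standard errors, and therefore different interval widths, than this simple-random-sampling formula.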

Results

Overall descriptive summary

The descriptive summary for invalid self-reported non-smoking by "true" smokers who do not currently use smokeless tobacco, pipe, cigars or nicotine gum is reported in Table 1, along with the ORCrude for the association between invalid self-reported non-smoking and the potential explanatory variables. Overall, 5.8% of "true" smokers gave an invalid report of non-smoking. Older "true" smokers, 65+ years old (12.6%), were almost 4 times (ORCrude = 3.88; 95% CI: 1.95–7.73) more likely to report invalid non-smoking than 45–64 year olds (3.6%). NHB smokers (8.5%) were about 1.5 times (ORCrude = 1.53; 95% CI: 1.04–2.26) more likely to report invalid non-smoking than NHW smokers (5.8%). Smokers who did not graduate from high school (8.6%) were twice as likely (ORCrude = 2.04; 95% CI: 1.35–3.08) to report non-smoking as high school graduates (4.4%). Smokers with diabetes (15.0%) were over 3 times (ORCrude = 3.26; 95% CI: 1.38–7.72) more likely to report non-smoking than smokers without diabetes (5.1%). There was no statistically significant association between invalid self-reported non-smoking and gender, income, history of myocardial infarction, or hypertension.
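The link between the percentages and the crude odds ratios quoted above can be checked with a short calculation. This is a sketch under the assumption that each crude odds ratio is the ratio of the odds implied by the two weighted percentages; because the published percentages are rounded, the recomputed values are expected to agree only approximately. The helper names are hypothetical.

```python
def odds(p: float) -> float:
    """Convert a proportion (strictly between 0 and 1) to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_comparison: float, p_reference: float) -> float:
    """Odds ratio implied by two proportions."""
    return odds(p_comparison) / odds(p_reference)


# (comparison, proportion in comparison group, proportion in reference group,
#  crude OR as reported in the text)
REPORTED = [
    ("65+ vs 45-64 years", 0.126, 0.036, 3.88),
    ("NHB vs NHW", 0.085, 0.058, 1.53),
    ("not HS graduate vs HS graduate", 0.086, 0.044, 2.04),
    ("diabetes vs no diabetes", 0.150, 0.051, 3.26),
]

for label, p1, p0, reported in REPORTED:
    print(f"{label}: recomputed {odds_ratio(p1, p0):.2f}, reported {reported:.2f}")
```

Small differences between recomputed and reported values are consistent with rounding of the percentages to one decimal place.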

Table 1 Descriptive summary and association with validity of self-reported smoking status, 45+ year old "true" smokers, United States, 1988–1994.

Invalid self-reported non-smoking by "true" smokers with and without diabetes

The age-specific descriptive summary of "true" smokers self-reporting their smoking status is reported in Table 2, along with six gender-race/ethnicity specific categories, stratified by diabetes status. The proportion of "true" smokers with diabetes reporting non-smoking was higher for 65+ year olds (19.8%) than for 45–64 year olds (12.3%). Among 45+ year old smokers with diabetes, 15.0% self-reported as non-smokers, ranging from 0.0% of MA males, to 21.5% of NHW males and 25.4% of NHB females. Among "true" smokers without diabetes the proportion self-reporting invalid non-smoking was higher for older adults, with 3.0% of 45–64 year olds compared to 11.9% of 65+ year olds. Among 45+ year old smokers without diabetes, 5.1% self-reported as non-smokers, ranging from 2.5% of MA females, to 6.8% of MA males and 9.5% of NHB females.

Table 2 Number (unweighted) and percent (weighted) of invalid self-reported non-smoking by "true" smokers by age, gender, race/ethnicity and disease status, United States, 45+ year olds, 1988–1994

Invalid self-reported non-smoking by "true" smokers with and without a history of myocardial infarction

In Table 2, the proportion of "true" smokers with a history of myocardial infarction reporting non-smoking was higher for 65+ year olds (15.6%) than for 45–64 year olds (1.9%). Among 45+ year old "true" smokers with a history of myocardial infarction, 6.4% self-reported as non-smokers, ranging from 0.0% of MA females, to 37.0% of MA males. Among "true" smokers without a history of myocardial infarction the proportion self-reporting invalid non-smoking was higher for 65+ year olds (12.4%) than for 45–64 year olds (3.7%). Among 45+ year old "true" smokers without a history of myocardial infarction, 5.8% self-reported as non-smokers, ranging from 3.3%–6.6% of MA females, MA males, NHW females, NHB males and NHW males, to 11.8% of NHB females.

Invalid self-reported non-smoking by "true" smokers with and without hypertension

In Table 2, the proportion of "true" smokers with hypertension reporting non-smoking was higher among older smokers, with 11.6% of 65+ year olds compared to 5.2% of 45–64 year olds. Among 45+ year old "true" smokers with hypertension, 7.0% self-reported as non-smokers, ranging from 2.9% of MA females to 15.6% of MA males. Among "true" smokers without hypertension the proportion self-reporting invalid non-smoking was higher among older smokers, with 13.2% of 65+ year olds compared to 3.0% of 45–64 year olds. Among 45+ year old "true" smokers without hypertension, 5.3% self-reported as non-smokers, ranging from 3.2%–4.9% of MA females, NHW females, MA males, and NHB males, to 11.0% of NHB females.

Association between invalid self-reported non-smoking and diabetes, history of myocardial infarction, hypertension, and socioeconomic status

The most parsimonious model for the association between invalid self-reported non-smoking and diabetes, simultaneously adjusting for education, and race/ethnicity-gender is reported in Table 3. Among "true" smokers, those with diabetes (ORAdj = 3.15; 95% CI: 1.35–7.34) and NHB females (ORAdj = 5.12; 95% CI: 1.41–18.58) were 3 to 5 times more likely to report non-smoking than those without diabetes, and MA females, respectively. Among "true" smokers, those who did not graduate from high school (ORAdj = 2.05; 95% CI: 1.30–3.22) and NHB females (ORAdj = 1.96; 95% CI: 1.17–3.28) were twice as likely to report non-smoking as high school graduates and NHW females, respectively. When we tested a full-model simultaneously adjusting for all three smoking-related diseases, this did not change any conclusions. In the full model the point-estimate for diabetes was essentially unchanged (ORAdj = 3.13; 95% CI: 1.33–7.35) and history of myocardial infarction (P = 0.97) and hypertension (P = 0.71) were non-significant variables; therefore they were excluded in the final model.

Table 3 Association between invalid self-reported non-smoking and diabetes, education, and race/ethnicity-gender, 45+ year olds, United States, 1988–1994

Discussion

Invalid self-reported non-smoking by "true" smokers is a critical aspect for the clinical management of smokers, when estimating smoking prevalence, estimating excess morbidity and mortality associated with smoking, and when collecting smoking data in clinical trials and surveys and other observational studies. Our study found "true" smokers with diabetes were more likely to report invalid non-smoking than "true" smokers without diabetes, but having a history of myocardial infarction or hypertension did not predict invalid reporting of non-smoking. A possible explanation for these different results regarding predictors of invalid self-reported non-smoking may be related to the relatively slowly progressing chronic nature of diabetes compared to acute myocardial infarction which could be considered a more serious, major life changing event. This possibility is based on the finding that smokers who had a history of myocardial infarction were over four times more likely to quit smoking than those being diagnosed with diabetes [53], and that duration of diabetes was not associated with smoking [54]. Thus smokers who had a history of myocardial infarction may be more likely to truly quit smoking than smokers with diabetes or hypertension.

A possible explanation for our results differing from previous reports may be related to our modeling approach. We assessed invalid self-reported non-smoking by "true" smokers as the outcome, fitting diabetes, education, race/ethnicity and gender main effects along with an interaction term, whereas the other approaches were based on unadjusted analyses or multiple regression analyses of self-reported smoking without interaction terms. The incorporation of an interaction term for race/ethnicity and gender allows for the ascertainment of "true" smokers who are more likely to report non-smoking, in order to develop a targeted approach to identify smokers for smoking cessation interventions. This is important in light of the findings that health care providers and individuals with diabetes are not sufficiently aware of the increased risk of developing cardiovascular disease for smokers with diabetes [18, 19].

To the best of our knowledge our study is the first population-based report on invalid self-reported non-smoking restricted to those who do not currently use smokeless tobacco, pipe, cigars, or nicotine gum. Even though there may be some invalid responses to the use of non-cigarette tobacco products, these products were used much less frequently in NHANES III, with 4990 current cigarette smokers compared to 602 current smokeless tobacco users, 296 current cigar users and 148 current pipe users. If the proportion of adults who report invalid non-tobacco use is similar for each tobacco group, cigarette users would be impacted the most because of the higher prevalence. This may explain why our results differ from previous reports. Due to this restriction, our study had limited statistical power related to the small sample size in the age, race/ethnicity-gender specific subgroup analyses, especially when the analyses were stratified by diabetes, history of myocardial infarction, and hypertension history. While the oldest (i.e., 65+ year old) "true" smokers were more likely to self-report non-smoking than 45–64 year olds, the effect of age was not assessed further due to the small sample size of younger smokers with diabetes or history of myocardial infarction who self-reported as non-smokers.

Deception may explain most of the invalid self-reports of non-smoking [55]. Social desirability related to the social unacceptability of cigarette smoking [56] is an additional possible explanation of the invalid self-reported non-smoking status by "true" smokers. That is, NHB female smokers, smokers with diabetes or who did not graduate from high school may be more prone to being influenced by social pressure to not smoke and thus provide the socially desirable non-smoking response.

In addition, other researchers have investigated potential explanations for serum cotinine levels in non-smokers, including dietary intake of food items previously reported to have measurable levels of nicotine, such as potatoes, tomatoes, eggplant, cauliflower, green peppers, iced tea, and brewed tea [29]. These food items were not significant in the regression models for adults [29].

We believe this limitation is outweighed by the following strengths of our study: 1) focusing on the most relevant patient management, public health surveillance, and study design issues of "true" smokers by incorporating an objective measure to complement self-report; 2) analyses restricted to those who did not currently use smokeless tobacco, cigars, pipes, or nicotine gum addressed the limitation of cotinine being elevated in users of snuff and chewing tobacco [36]; 3) our modeling approach included race/ethnicity-gender interaction; 4) NHANES III sampling methodology is designed to represent the US population; 5) high quality and quantity of questionnaire, examination and laboratory data collected in an unbiased manner such that the participants were unaware of our study on predictors of invalid self-reporting of non-smoking status.

"True" smokers reporting non-smoking have higher mortality rates, higher prevalence of tobacco-related cancer, and higher prevalence of a history of myocardial infarction compared to true nonsmokers [43]. Our findings indicate that the impact of invalid self-reported non-smoking on morbidity and mortality may slightly under-estimate the true impact for 45–64 year olds, with a more substantially under-estimate for 65+ year olds. In addition, the Healthy People 2010 objective 27-1a to reduce smoking prevalence [57, 58] is based on self-reported data that are generally reported to be valid [36]. However, apparently not considered was the wide variability in the proportion of smokers reporting they were not smokers ranging from 0% to 94% [36]. Our findings raise concerns regarding invalid self-reported non-smoking by specific age, race/ethnicity-gender, education or diabetes. Additional research using a similar approach with larger sample sizes, that simultaneously takes into account sociodemographic and health status characteristics is needed to investigate this study's findings further.

These findings have potentially important general implications. An objective assessment of tobacco use, either through a targeted approach by health care providers, or as the general protocol when collecting tobacco use data in a population surveillance, clinical, or epidemiologic study, identifies smokers who deny smoking. Three general implications are: 1) Identification of smokers who deny smoking provides an opportunity for health care providers to present tobacco cessation counseling to those individuals who would not otherwise receive such advice, with the objective being the improvement of the individual patient's health. 2) Collection of an objective measure of current tobacco use may provide a better estimate of the true prevalence rates in the population and/or among the specific age, race, and gender subgroups. 3) Using an objective measure of tobacco use in a clinical or epidemiologic study addresses the possible explanation that the results may be due to systematic misclassification bias of self-reported tobacco use.

One approach to address smoking misclassification in clinical or epidemiologic studies would be the development of a statistical approach, similar to those available but limited to a single covariate [59, 60]. Alternative approaches include incorporating objective smoking measures to complement self-report, either by excluding "true" smokers who report never or former tobacco use, or by creating and assessing a separate category of "true" smokers who report never or former tobacco use. Even though objective tobacco use measures identify only current "true" tobacco users, and most smoking-related diseases are chronic and require a smoking history, excluding "true" smokers who self-report non-smoking (i.e., never or former smoking) decreases smoking misclassification bias and may decrease bias associated with other inaccurate responses such as self-reported disease status. Further investigation of questions assessing the validity of self-reported smoking status may also be useful.

Conclusion

We found non-random or systematic bias, indicating the inability to determine the direction in which reported estimates differ from the true effects (under-estimate, over-estimate, or true estimate). Validity of self-reported non-smoking may be related to the relatively slowly progressing chronic nature of diabetes, in contrast with the acute event of myocardial infarction which could be considered a more serious, major life changing event. These data also raise questions regarding the possible role of societal desirability in the validity of self-reported non-smoking, especially among "true" smokers with diabetes, NHB females, and those who did not graduate from high school. Health care providers may consider using objective measures of tobacco use, especially among those subgroups that are more likely to deny smoking, followed by tobacco cessation counseling and intervention information. Studies collecting self-reported smoking data must address the potentially invalid reporting of non-smoking, and should include a statement about the potentially biased estimate.