Introduction

Stroke, categorized into ischemic and hemorrhagic types, is a leading cause of death and disability worldwide [1]. Under the World Health Organization definition, it arises from cerebral vascular damage and subsequent brain tissue injury, with clinical signs that persist beyond 24 h or lead to death. Ischemic stroke primarily involves arteriosclerotic plaque formation, thrombosis due to reduced blood flow, and microvascular disease, whereas hemorrhagic stroke is mainly linked to hypertension, cerebral aneurysm rupture, or vascular malformations [2]. The progression of stroke is influenced by cellular and molecular mechanisms such as inflammation, oxidative stress, apoptosis, and autophagy [3].

Stroke is a major health issue in the United States, with over 795,000 people affected each year. Around 610,000 of these are first-time strokes, and 185,000 are repeat occurrences [4]. It causes approximately 140,000 deaths annually, making it the fifth leading cause of death in the U.S. The total cost of stroke, including healthcare, medicine, and missed work, is about $34 billion per year. Stroke is also the leading cause of long-term disability in the country [4].

Smoking-related factors, such as monthly smoking quantity, household smoking exposure [5], and the presence of tar, nicotine, and carbon monoxide in tobacco, are significantly associated with stroke risk [6, 7]. These indicators reflect the frequency and environmental exposure to smoking and the potential harm of inhaled smoke [8]. Notably, tar, nicotine, and carbon monoxide contribute to vascular endothelium damage [9], increased blood viscosity, and reduced oxygen transport, thereby elevating stroke risk [10,11,12]. High smoking quantity, smoking within the household, and high levels of harmful tobacco components correlate with an increased risk of stroke [13, 14]. These factors are essential for assessing individual stroke risk and underscore the importance of reducing smoking exposure and controlling harmful tobacco components for stroke prevention [15].

The National Health and Nutrition Examination Survey (NHANES), a comprehensive survey representing the U.S. population’s health and nutrition status, has yet to be used to explore the relationship between smoking-related indicators and stroke risk. This study aims to address this gap by analyzing NHANES data to clarify the connection between smoking-related indicators and stroke incidence [16].

Methods

Study population

The National Health and Nutrition Examination Survey (NHANES) serves as a cross-sectional survey aimed at gathering data on the health and nutritional status of the U.S. residential population. Information was obtained via structured home interviews, physical assessments at a mobile examination center, and laboratory evaluations, utilizing a multistage probability sampling design. The NHANES study protocol received approval from the National Center for Health Statistics (NCHS) ethics committee, and written consent was obtained from all participants.

From the NHANES 2003–2018 dataset, a total of 80,312 individuals were initially identified. Participants under the age of 18 (n = 32,549) were excluded, as were individuals lacking stroke status information (n = 2,975) and those with missing data on smoking-related indicators (n = 35,612). Consequently, 9,176 participants were included in the final analysis. The selection process is illustrated in Fig. 1. All data used in this study are publicly available (https://www.cdc.gov/nchs/nhanes/), and survey weights were applied in subsequent analyses.
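The participant flow can be checked with simple arithmetic. The following minimal sketch uses only the counts reported in the text; the function and variable names are illustrative and not part of the original analysis.

```python
# Participant flow reported in the text (NHANES 2003-2018).
initial = 80_312
exclusions = [
    ("age under 18", 32_549),
    ("missing stroke status", 2_975),
    ("missing smoking-related indicators", 35_612),
]


def apply_exclusions(start, steps):
    """Return (reason, remaining count) after each exclusion step."""
    remaining = start
    trail = []
    for reason, n_removed in steps:
        remaining -= n_removed
        trail.append((reason, remaining))
    return trail


flow = apply_exclusions(initial, exclusions)
final_n = flow[-1][1]  # participants retained for analysis
```

The final count agrees with the 9,176 participants reported for the analysis.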

Fig. 1 Flowchart of the participant selection from NHANES 2003–2018

Smoking-related indicators

The “Smokes” variable was calculated by multiplying the number of days smoked in the past 30 days by the average number of cigarettes smoked per day, based on participant responses (SMD650). Additional smoking-related variables, including “Cigarette Length” (SMD100LN), “Cigarette Filter” (SMD100FL), “Tar” (SMD100TR), “Nicotine” (SMD100NI), and “Carbon Monoxide” (SMD100CO) content, were directly obtained from questionnaire responses. The “FamilySmoking” variable was derived from questions about household members’ smoking habits (SMQFAM). This approach provides a detailed overview of participants’ smoking behavior and exposure.
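The derivation of the “Smokes” variable can be sketched as follows. This is a minimal illustration under stated assumptions: the parameter names days_smoked and cigs_per_day are hypothetical stand-ins for the two questionnaire responses, and returning None for missing answers is our assumption rather than a documented rule of the study.

```python
def monthly_cigarettes(days_smoked, cigs_per_day):
    """Cigarettes smoked in the past 30 days.

    days_smoked and cigs_per_day are hypothetical names for the two
    questionnaire answers; None marks a missing response (an assumption).
    """
    if days_smoked is None or cigs_per_day is None:
        return None
    if not 0 <= days_smoked <= 30:
        raise ValueError("days_smoked must lie between 0 and 30")
    if cigs_per_day < 0:
        raise ValueError("cigs_per_day must be non-negative")
    return days_smoked * cigs_per_day
```

For example, a participant who smoked 20 cigarettes on each of 30 days would have a “Smokes” value of 600.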

Diagnosis of stroke

The diagnosis of stroke was ascertained through self-reported questionnaires (MCQ160f). Specifically, participants were queried, “Has a doctor or other health professional ever told you that you had a stroke?” The response options available to the participants were “Yes” or “No.” This method relies on participant disclosure of a medically confirmed stroke diagnosis to identify cases within the study population.

Assessment of covariates

Our study included covariates to assess their impact: age (below 60, over 60 years), gender (female, male), race (Non-Hispanic White, Non-Hispanic Black, Mexican American, Other Hispanic, Asian, Other Race), education (9-11th grade, high school graduate, some college/AA degree, college graduate, less than 9th grade), marital status (living with a partner, married, never married, widowed/divorced/separated), poverty-to-income ratio (PIR below 1.3, 1.3 to 3.5, over 3.5), body mass index (BMI categorized into normal, overweight, obese, underweight), and conditions like hypertension and hyperlipidemia. This set of covariates helps explore various factors influencing the study’s focus.
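The age and PIR groupings described above could be coded along the following lines. This is only a sketch: the text does not state how values falling exactly on a cut point were handled, so the boundary rules and category labels below are assumptions.

```python
def age_group(age_years):
    """Two-level age covariate.

    Assumption: age 60 exactly is placed in the older group.
    """
    return "below 60" if age_years < 60 else "60 and over"


def pir_group(pir):
    """Three-level poverty income ratio covariate.

    Assumption: both cut points (1.3 and 3.5) belong to the middle band.
    """
    if pir < 1.3:
        return "below 1.3"
    if pir <= 3.5:
        return "1.3 to 3.5"
    return "over 3.5"
```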

Statistical methods

This study utilized NHANES data spanning 2003 to 2018 to compare baseline characteristics of participants with and without stroke. Analyzed variables included age, gender, educational level, marital status, body mass index (BMI), poverty income ratio (PIR), hypertension, hyperlipidemia, diabetes, and alcohol consumption. Continuous variables are presented as weighted means ± standard errors, and categorical variables as weighted frequencies and percentages.

To explore the preliminary association between smoking and stroke risk, we conducted a weighted univariate logistic regression analysis. This step aimed to assess the impact of smoking behavior (as an independent variable) on the risk of stroke without considering other potential confounders.
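For a single binary exposure, the point estimate of a weighted univariate logistic regression coincides with the weighted cross-product odds ratio, which gives a simple way to illustrate the idea. The sketch below is not the authors' R workflow; the record layout and toy weights are hypothetical, and design-based standard errors, which require the full survey design, are not computed.

```python
def weighted_odds_ratio(records):
    """Weighted odds ratio for a binary exposure and binary outcome.

    records: iterable of (exposed, stroke, weight) tuples (hypothetical layout).
    """
    cells = {
        (True, True): 0.0,
        (True, False): 0.0,
        (False, True): 0.0,
        (False, False): 0.0,
    }
    for exposed, stroke, weight in records:
        cells[(bool(exposed), bool(stroke))] += weight
    a, b = cells[(True, True)], cells[(True, False)]    # exposed: stroke / no stroke
    c, d = cells[(False, True)], cells[(False, False)]  # unexposed: stroke / no stroke
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined: an off-diagonal cell has zero weight")
    return (a * d) / (b * c)


# Toy data, invented purely for illustration.
toy_records = [
    (True, True, 2.0),
    (True, False, 8.0),
    (False, True, 1.0),
    (False, False, 9.0),
]
toy_or = weighted_odds_ratio(toy_records)  # (2 * 9) / (8 * 1) = 2.25
```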

Further, we used weighted multivariate logistic regression to examine the relationship between smoking-related indicators and stroke risk while adjusting for potential confounders. Four successively adjusted models were fitted: Model 1 made no adjustments; Model 2 adjusted for age and gender; Model 3 adjusted for the Model 2 covariates plus educational level, marital status, BMI, and PIR; Model 4 adjusted for the Model 3 covariates plus hypertension, hyperlipidemia, diabetes, and alcohol consumption.
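The nesting of the four models can be made explicit by building each covariate set cumulatively. The sketch below only assembles the adjustment sets and a formula string; the variable names (such as marital_status) are illustrative placeholders, not NHANES variable codes.

```python
from itertools import accumulate

# Covariates ADDED at each step, as described in the text.
ADJUSTMENT_BLOCKS = [
    ("Model 1", []),
    ("Model 2", ["age", "gender"]),
    ("Model 3", ["education", "marital_status", "bmi", "pir"]),
    ("Model 4", ["hypertension", "hyperlipidemia", "diabetes", "alcohol"]),
]


def nested_covariates(blocks):
    """Map each model name to the cumulative list of covariates it adjusts for."""
    cumulative = accumulate(
        (added for _, added in blocks),
        lambda so_far, new: so_far + new,
    )
    return {name: covs for (name, _), covs in zip(blocks, cumulative)}


def formula(outcome, exposure, covariates):
    """Build an R-style formula string for one exposure and its adjustment set."""
    return f"{outcome} ~ " + " + ".join([exposure] + covariates)


models = nested_covariates(ADJUSTMENT_BLOCKS)
```

Under these placeholder names, formula("stroke", "nicotine", models["Model 2"]) yields "stroke ~ nicotine + age + gender".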

A subgroup analysis was also included to specifically examine the association between family smoking habits and stroke risk. This analysis considered the smoking behavior of different family members and its potential impact on an individual’s risk of stroke.

We employed “restricted cubic splines” to model non-linear relationships between smoking-related variables and stroke risk. This method allows us to explore complex patterns in the data without assuming a straight-line relationship. For our analyses, we used the rcs() function from the rms package in R [17]. This package is specifically designed for such regression modeling, helping us accurately represent the relationships while avoiding overfitting.
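To make the spline construction concrete, the following sketch computes a restricted cubic spline basis in the truncated power form, scaled by the squared distance between the outer knots. It is a Python illustration rather than the rms implementation, and the knot locations used in the example are hypothetical, not those of the study.

```python
def rcs_basis(x, knots, normalize=True):
    """Restricted cubic spline basis [x, s_1(x), ..., s_{k-2}(x)] for k knots.

    The nonlinear terms are zero below the first knot and linear in x
    beyond the last knot.
    """
    k = len(knots)
    if k < 3 or any(b <= a for a, b in zip(knots, knots[1:])):
        raise ValueError("need at least three strictly increasing knots")
    t_last, t_prev = knots[-1], knots[-2]
    scale = (knots[-1] - knots[0]) ** 2 if normalize else 1.0

    def cube_plus(u):
        # Truncated cubic: u^3 when u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    terms = [float(x)]
    for t_j in knots[:-2]:
        s = (
            cube_plus(x - t_j)
            - cube_plus(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
            + cube_plus(x - t_last) * (t_prev - t_j) / (t_last - t_prev)
        )
        terms.append(s / scale)
    return terms


# Hypothetical knots for illustration only.
example_knots = [5.0, 10.0, 12.0, 20.0]
example_terms = rcs_basis(15.0, example_knots)  # one linear + two nonlinear terms
```

With k knots the basis has k − 1 columns, so the spline is far more constrained than an unrestricted cubic spline, which is what limits overfitting in the tails.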

All data analyses were conducted within the R software environment (version 4.1.3, available at http://www.R-project.org). Specific R packages were utilized for handling weighted data, and the aforementioned statistical methods were applied to assess the relationship between smoking behavior and the prevalence of stroke. All statistical tests were two-sided, with P values < 0.05 considered statistically significant.
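As an illustration of how an odds ratio, its 95% confidence interval, and a two-sided P value follow from a fitted log-odds coefficient, a minimal sketch is given below. It assumes a normal (Wald) approximation; survey-weighted analyses may instead use a t reference distribution with design-based degrees of freedom, which is not shown here.

```python
import math


def wald_summary(beta, se, z_crit=1.959963984540054):
    """Odds ratio, 95% CI, and two-sided p-value from a log-odds coefficient.

    Assumes a normal approximation for the Wald statistic beta / se.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "or": math.exp(beta),
        "ci_low": math.exp(beta - z_crit * se),
        "ci_high": math.exp(beta + z_crit * se),
        "p": p_two_sided,
    }
```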

Results

Baseline characteristics of study participants

In our study, we analyzed 9,176 participants to quantify the association between smoking and stroke prevalence. The overall stroke prevalence was 3.4%, with the remaining participants (approximately 97%) reporting no stroke. The clinical characteristics of participants with and without stroke are shown in Table 1. The two groups differed significantly (p < 0.05) in “Age,” “Gender,” “Education,” “Marital Status,” “Poverty Income Ratio (PIR),” “Body Mass Index (BMI),” “Hypertension,” “Hyperlipidemia,” “Diabetes,” “Alcohol Consumption,” “Exposure to Family Smoking,” “Smoking Status,” “Duration of Cigarette Consumption,” “Tar,” “Nicotine,” and “Carbon Monoxide” levels. Conversely, no significant difference was observed for “Race” or “Cigarette Filter.”

Table 1 Baseline characteristics of patients with or without stroke according to NHANES 2003–2018

Weighted univariate logistic analysis of stroke

As shown in Table 2, stroke risk was significantly higher (OR > 1, p < 0.05) among individuals aged over 60, males, and those with pre-existing conditions such as hypertension, hyperlipidemia, and diabetes. Among smoking-related characteristics, cigarette length (“Long” and “Ultra long”), nicotine, and carbon monoxide levels were associated with a marked increase in stroke risk. Conversely, higher levels of education, never-married status, and a higher poverty income ratio (PIR > 3.5) were associated with a decreased risk of stroke (OR < 1, p < 0.05), indicating the multifaceted nature of stroke risk factors.

Table 2 Weighted univariate logistic analysis of stroke

Relationship between stroke and smoking-related indicators

The four models give a graded view of how smoking-related indicators relate to stroke risk (Table 3; Fig. 2). Family smoking was associated with higher stroke risk, with the OR falling from 1.88 in Model 1 to 1.65 in Model 4, so the association remained significant but attenuated with adjustment. Nicotine showed the strongest association, with an OR of 2.39 in Model 1 rising to 2.64 in Model 4. Tar and carbon monoxide were also positively associated with stroke, with an OR of 1.07 for tar in all models and an OR for carbon monoxide that rose slightly from 1.10 in Model 1 to 1.11 in Model 4. These findings underline the importance of addressing smoking-related factors in stroke prevention strategies.

Table 3 Weighted multivariate logistic analysis smoking-related indicators and stroke
Fig. 2 Forest plot of the weighted multivariate logistic analysis of smoking-related indicators and stroke

Subgroup analysis for the association between family smoking and stroke

Subgroup analysis elucidates the association between family smoking and stroke prevalence (Table 4; Fig. 3), revealing nuanced differences across various demographics. Age-wise, individuals below 60 show a stronger association (OR = 2.04) compared to those over 60 (OR = 1.57). Gender differences are minimal, with females (OR = 1.83) and males (OR = 1.78) nearly equally affected. Racial disparities are highlighted, especially among Mexican Americans (OR = 2.95) and Other Hispanics (OR = 4.35), indicating a higher susceptibility. Education level presents a gradient effect, with some college/AA degree holders showing an OR of 2.33. Marital status impacts risk differently, with those living with a partner at a markedly higher risk (OR = 3.40). Obesity emerges as a significant risk enhancer (OR = 2.95), paralleling findings in diabetes (OR = 2.12) and hyperlipidemia (OR = 2.42). These results underscore the complex interplay between family smoking and stroke risk, necessitating targeted prevention strategies across diverse population segments.

Table 4 Subgroup analysis for the association between family smoking and stroke
Fig. 3 Forest plot of the subgroup analysis for the association between family smoking and stroke

The non-linear relationship between stroke and smoking-related indicators

Using restricted cubic splines with two groups divided by “FamilySmoking,” we found a nonlinear relationship between smoking indicators and stroke risk (p < 0.001). Threshold effects were identified: “Smokes” at 410 (Fig. 4A), “Tar” at 12 (Fig. 4B), “Nicotine” at 1.1 (Fig. 4C), and “Carbon Monoxide” at 12 (Fig. 4D). Below these thresholds, stroke risk is stable or decreases; above them, risk sharply increases. Additionally, people without family smoking history have a lower stroke risk than those with such a history.
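The reported thresholds can be applied to an individual profile as in the sketch below. The threshold values are taken from the text; the units are not stated there and are therefore left unspecified, and the helper function and the example profile are hypothetical.

```python
# Inflection points reported in the text (units as in the source data).
THRESHOLDS = {
    "Smokes": 410,
    "Tar": 12,
    "Nicotine": 1.1,
    "Carbon Monoxide": 12,
}


def above_threshold(values):
    """Return, in alphabetical order, the indicators exceeding their threshold.

    values: dict mapping indicator name to a measurement; missing (None)
    or unrecognized indicators are ignored.
    """
    return sorted(
        name
        for name, value in values.items()
        if name in THRESHOLDS and value is not None and value > THRESHOLDS[name]
    )


# Hypothetical participant profile, for illustration only.
flags = above_threshold({"Smokes": 600, "Tar": 10, "Nicotine": 1.2})
```

Such a flag only indicates that a value lies in the range where the fitted curve rises; it is not a risk estimate in itself.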

Fig. 4 Restricted cubic spline fitting for the association between smoking-related indicators and stroke. The non-adjusted relationship between Smokes and Stroke (A). The non-adjusted relationship between Tar and Stroke (B). The non-adjusted relationship between Nicotine and Stroke (C). The non-adjusted relationship between Carbon Monoxide and Stroke (D)

Discussion

Our study analyzed data from 9,176 participants, revealing significant correlations between smoking behaviors and related indicators (including monthly smoking amount, household smoking exposure, levels of tar, nicotine, and carbon monoxide) and the risk of stroke. Notably, levels of nicotine and carbon monoxide were closely linked to an increased risk of stroke, with household smoking exposure identified as an independent risk factor. Furthermore, through stratified and nonlinear analyses, we unveiled how age, gender, race, and other socioeconomic factors influence this relationship, identifying high-risk groups and suggesting potential preventative measures.

Nicotine, the primary active component in tobacco, affects the human body in multiple ways, especially the cardiovascular system [18,19,20]. It activates neuronal nicotinic acetylcholine receptors (nAChRs), triggering a series of biological responses including the release of adrenaline and noradrenaline [21]. This increase in neurotransmitters accelerates heart rate and elevates blood pressure, burdening the cardiovascular system and thereby elevating stroke risk. Additionally, nicotine can increase blood viscosity, enhancing the likelihood of thrombosis through promoted platelet aggregation and fibrin formation [22,23,24,25].

Carbon monoxide (CO), another harmful substance produced during smoking, combines with hemoglobin to form carboxyhemoglobin, reducing the blood’s oxygen-carrying capacity [26]. This results in decreased oxygen delivery to tissues and organs [27, 28], especially the brain, increasing the risk of hypoxic damage [29, 30]. Long-term exposure to high levels of CO can also cause chronic endothelial damage, promoting the progression of atherosclerosis [31, 32].

Tar, a complex mixture generated by tobacco combustion, contains thousands of chemicals, many of which are harmful to the human body, particularly the cardiovascular system [33, 34]. Various chemicals in tar, such as polycyclic aromatic hydrocarbons (PAHs), free radicals, and heavy metals, have direct and indirect impacts on stroke risk [35]. Harmful chemicals in tar can directly damage vascular endothelial cells, leading to endothelial dysfunction [36]. The damage caused by tar’s harmful substances reduces the production of nitric oxide (NO), an important vasodilator, disrupting vascular regulation and increasing the risk of thrombosis and, consequently, stroke. Tar’s damage to the endothelium facilitates inflammatory responses and lipid deposition, accelerating the atherosclerosis process [37]. Additionally, chemicals in tar, especially PAHs and free radicals, elevate oxidative stress levels within the body [38, 39].

Our study shows a strong link between smoking factors such as high nicotine and carbon monoxide levels and increased stroke risk. This suggests more research is needed on the biological and genetic factors that enhance smoking’s effect on stroke risk. We recommend including smoking-related measures in stroke risk assessments to improve prediction accuracy and guide prevention efforts. Healthcare providers can use this data to create specific strategies to lower smoking rates and reduce stroke occurrences, which may improve public health. Our findings, derived from a large and diverse cohort of 9,176 NHANES participants, suggest a strong relationship between smoking and increased stroke risk. However, the applicability of these results to broader populations may be influenced by regional and demographic variations in smoking behaviors and stroke prevalence. While the observed patterns across different demographic subgroups reinforce the robustness of our results, further studies are needed to validate these findings across various global populations to enhance the universality of our conclusions.

Our research has several strengths. To our knowledge, it is the first study using the NHANES database to explore the relationship between smoking-related indicators and stroke risk on a large scale. Leveraging the comprehensive sample and complex sampling design of the NHANES database, we conducted in-depth analysis using weighted logistic regression models, adjusting for multiple covariates to enhance the accuracy and reliability of our findings. Furthermore, we employed restricted cubic splines and smooth curve fitting techniques to explore the nonlinear relationship between smoking-related indicators and stroke risk, revealing potential inflection points and providing new insights into the complex link between smoking behavior and stroke risk.

Despite offering insights into the association between smoking behavior and stroke risk, our study has several limitations. Firstly, the reliance on self-reported data from the NHANES database, including smoking habits and household smoking exposure, may introduce reporting bias, potentially affecting the accuracy of our assessment of the relationship between smoking and stroke. Secondly, the cross-sectional nature of our study means we can only observe the association between smoking behavior and stroke risk at a specific point in time, not establish causality. This limitation is particularly relevant given the potential for reverse causation, where individuals who have already experienced a stroke may change their smoking behavior. Lastly, because our analysis covered all adults aged 18 years and older rather than focusing on the elderly, it may not fully capture the risks associated with smoking in older age groups, suggesting a need for future research that specifically examines the impact of smoking on stroke risk in elderly populations.

Conclusion

This study found a significant positive association between smoking and the risk of stroke, with family smoking exposure, nicotine, and carbon monoxide levels showing particularly strong associations. Our analysis also identified nonlinear relationships and threshold effects between smoking-related indicators and stroke risk, underscoring the importance of considering smoking behaviors in stroke prevention strategies. Our findings support the use of smoking-related indicators as tools for predicting stroke risk, especially in the development of personalized prevention measures and early interventions. We hope that these insights will facilitate more precise stroke risk assessment and the formulation of prevention strategies, particularly targeting high-risk groups.