Prediction and causal inference of cardiovascular and cerebrovascular diseases based on lifestyle questionnaires

Nambo, Riku; Karashima, Shigehiro; Mizoguchi, Ren; Konishi, Seigo; Hashimoto, Atsushi; Aono, Daisuke; Kometani, Mitsuhiro; Furukawa, Kenji; Yoneda, Takashi; Imamura, Kousuke; Nambo, Hidetaka

doi:10.1038/s41598-024-61047-w

Published: 07 May 2024

Volume 14, article number 10492, (2024)

Scientific Reports

Riku Nambo¹,
Shigehiro Karashima²,
Ren Mizoguchi³,
Seigo Konishi³,
Atsushi Hashimoto³,
Daisuke Aono³,
Mitsuhiro Kometani³,
Kenji Furukawa⁴,
Takashi Yoneda³,
Kousuke Imamura⁵ &
…
Hidetaka Nambo⁶

Abstract

Cardiovascular and cerebrovascular diseases (CCVD) are leading causes of death in Japan, necessitating effective preventive measures, early diagnosis, and treatment to mitigate their impact. A diagnostic model was developed to identify patients with ischemic heart disease (IHD), stroke, or both, using specific health examination data. Lifestyle habits affecting CCVD development were analyzed using five causal inference methods. This study included 473,734 participants aged ≥ 40 years who underwent specific health examinations in Kanazawa, Japan between 2009 and 2018, which collected data on basic physical information, lifestyle habits, and laboratory parameters related to diabetes, lipid metabolism, renal function, and liver function. Four machine learning algorithms were used: Random Forest, Logistic regression, Light Gradient Boosting Machine, and eXtreme-Gradient-Boosting (XGBoost). The XGBoost model exhibited the highest area under the curve (AUC), with mean values of 0.770 (± 0.003), 0.758 (± 0.003), and 0.845 (± 0.005) for stroke, IHD, and CCVD, respectively. The results of the five causal inference analyses were summarized, and lifestyle behavior changes were observed after the onset of CCVD. A causal relationship from ‘reduced mastication’ to ‘weight gain’ was found by all five causal inference methods. This prediction algorithm can screen for asymptomatic myocardial ischemia and stroke. By selecting high-risk patients suspected of having CCVD, resources can be used more efficiently for secondary testing.

Introduction

Cardiovascular and cerebrovascular diseases (CCVD) are major causes of death in the Japanese population^1,2. These diseases are classified as ‘symptomatic’ or ‘asymptomatic,’ according to the presence or absence of symptoms. Prevention, early diagnosis, and treatment are important to prevent poor quality of life (QoL) and death^2,3,4. In some cases, such as stroke, early treatment may lead to the recovery of neurological functions. Lifestyle modification is an important component of such prevention. Appropriate diet, moderate exercise, and good sleep management can reduce the risk of CCVD. Health check-ups are also crucial. In Japan, specific health checkups focusing on metabolic syndrome have been conducted since 2008 to prevent lifestyle-related diseases⁵. Based on the results of these checkups, specific health guidelines are provided to review lifestyle habits pertaining to exercise, diet, and smoking.

In recent years, artificial intelligence (AI) has attracted significant attention because of its use in preventive medicine. AI methods can automatically identify important patterns in an individual’s clinical data and predict disease onset and prognosis^6,7. Several studies have demonstrated the high accuracy of AI models for stroke and coronary artery disease^8,9. Causal inference in observational studies is also an effective approach: because it captures real-world events and behaviors, it may provide results that resemble real-world situations more closely than those of experimental studies^10,11.

Although several predictive models for CCVD have been proposed, reports remain limited, and none has examined causal relationships among the lifestyle factors used for prediction. Therefore, in this study, we developed a predictive model for CCVD using data obtained from metabolic syndrome screening in the general population and examined how specific lifestyle habits affect CCVD risk by testing multiple causal relationships.

Results

Participant characteristics

A total of 473,734 participants who underwent medical examinations were included in the KMA database. Of these, 261,645 individuals were excluded because they had at least one missing value. After the exclusion criteria were applied, the remaining 212,089 participants were classified into stroke (n = 10,713), IHD (n = 20,922), CCVD (n = 3,868), and normal (n = 176,586) groups. Table 1 lists the baseline characteristics of each disease group and the normal group. The IHD, stroke, and CCVD groups showed significant differences in age, female sex ratio, BMI, prevalence of HT, prevalence of DM, prevalence of DL, SBP, DBP, HbA1c, TG, HDL-C, and T-Cho compared to the normal group. Table 2 lists the lifestyle behaviors of the participants based on the questionnaire. There were significant differences in all lifestyle behaviors in the IHD, stroke, and CCVD groups compared with those in the normal group (P < 0.05, vs. normal group).

Table 1 Clinical characteristics of participants who underwent community screening in Kanazawa city.

Table 2 Lifestyle behaviors of participants.

Feature importance ranking

Supplementary Figure 1 shows the feature importance ranking for each prediction model using Dataset 2. eGFR, PG, and TG were consistently among the top 10 important features in all four IHD prediction models. MCV and TG were consistently among the top 10 important features in all four stroke prediction models. ALT, Hb, and TG were consistently among the top 10 important features in all four CCVD prediction models. TG consistently ranked in the top 10 important features in all 12 models.

Supplementary Figure 2 depicts the variation in the AUC as features were incrementally added to each prediction model, starting with the highest-ranked feature. For the LGBM, 19 features were chosen for stroke, 17 for IHD, and 20 for CCVD. For the RF model, 19, 15, and 20 features were selected for the stroke, IHD, and CCVD, respectively. In the XGBoost model, 19, 16, and 19 features were selected for the stroke, IHD, and CCVD, respectively. Finally, in the LR model, stroke incorporated four features, IHD incorporated 18 features, and CCVD incorporated two.

Predictive performance

Figure 1 shows the performance metrics of the CCVD predictive models. For Dataset 1, the RF models consistently achieved the highest AUCs, with mean values of 0.729 (SD 0.003), 0.716 (SD 0.003), and 0.809 (SD 0.005) for stroke, IHD, and CCVD, respectively.

In comparison, when Dataset 2 was analyzed, the XGBoost model exhibited the best AUC. The mean AUCs were 0.770 (SD = 0.003), 0.758 (SD = 0.003), and 0.845 (SD = 0.005) for stroke, IHD, and CCVD, respectively. The prediction model employing Dataset 2 surpassed that employing Dataset 1 for all the accuracy metrics.

Causal inference

Supplementary Figure 3 illustrates the causal network and inferred directions of causality determined using the five causal search methodologies. Supplementary Table 1 shows the score rankings for the lifestyle-related features. First place was “walking speed” with a score of 142. Second place went to “chewing” with a score of 100, and third was “weight gain” with a score of 85. In fourth place were “sleep habits” and “regular exercise” with the same score of 84. The top five rankings were selected as the features for causal inference. The Direct LiNGAM technique visually represents the magnitude of influence of one variable on another using a partial regression coefficient indicated by a connecting arrow. For the three models utilizing NOTEARS, causal arrow strength is denoted as ‘Edge Weight’. However, Bayesian networks do not quantitatively illustrate causality through metrics, such as partial regression coefficients or edge weights.

Figure 2 presents the ensemble analysis results and summarizes the findings of the five causal inference methods. Relationships identified as causal by three or more methods are indicated by arrows and are differentiated using dashed, solid, or bold lines. Importantly, the causal link from DM to DL and the pathway from ‘chewing’ to ‘weight gain’ were consistently supported across all five techniques.
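The vote-counting idea behind the ensemble can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `ensemble_edges` and the threshold parameter are assumptions, and the edges `("X", "Y")` and `("Y", "Z")` are made-up placeholders. Only the two named edges are taken from the text, which reports them as supported by all five methods.

```python
from collections import Counter

def ensemble_edges(edge_lists, min_votes=3):
    """Count how many methods report each directed edge; keep edges with enough votes."""
    votes = Counter(edge for edges in edge_lists for edge in set(edges))
    return {edge: n for edge, n in votes.items() if n >= min_votes}

# One (cause, effect) edge list per causal discovery method.
# ("X", "Y") and ("Y", "Z") are hypothetical placeholders, not study results.
methods = {
    "DirectLiNGAM": [("DM", "DL"), ("chewing", "weight gain"), ("X", "Y")],
    "NOTEARS": [("DM", "DL"), ("chewing", "weight gain"), ("X", "Y"), ("Y", "Z")],
    "NOTEARS-LASSO": [("DM", "DL"), ("chewing", "weight gain"), ("X", "Y")],
    "NOTEARS-PyTorch": [("DM", "DL"), ("chewing", "weight gain")],
    "BayesianNetwork": [("DM", "DL"), ("chewing", "weight gain"), ("Y", "Z")],
}

consensus = ensemble_edges(methods.values())
for (cause, effect), n in sorted(consensus.items(), key=lambda kv: -kv[1]):
    print(f"{cause} -> {effect}: supported by {n} of {len(methods)} methods")
```

Treating each edge as directed means that two methods proposing opposite directions for the same pair do not reinforce each other, which matches the emphasis on inferred causal direction.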

Discussion

A predictive model including lifestyle questionnaires was developed to diagnose CCVD. There are several reports on cardiovascular disease and stroke prediction models that use lifestyle information^12,13,14. Li et al. developed several models to predict cardiovascular disease using a dataset including lifestyle-related items such as smoking, alcohol consumption, dietary patterns, and physical activity for 1,887,710 adults in northwest China and found that the RF prediction model had the best prediction accuracy with an AUC of 0.723¹². Zhu et al. found the best prediction accuracy for stroke, with an AUC of 0.686, in a dataset of 2147 members of the general population using a model that included sex, age, lifestyle habits, genetic factors, medical history, nasal examination results, and blood sampling results¹⁴. However, these studies did not examine causality between the features used for prediction.

The precise predictive results of the algorithm are expected to alert individuals and trigger behavioral changes that lead to further guidance and closer examination. Conversely, the algorithm can reduce the financial burden and time wasted by patients with a low predictive risk who would otherwise undergo secondary testing. However, caution should be exercised when implementing this algorithm, as the model does not predict the future development of the disease, but rather indicates the possibility of prior myocardial ischemia or stroke occurrence in health-screening participants who are unaware of their symptoms. Asymptomatic stroke does not present with symptoms such as paralysis or sensory disturbance¹⁵, whereas asymptomatic myocardial ischemia is characterized by few or no typical symptoms of IHD, such as chest pain, pressure, nausea, and shortness of breath¹⁶. In practice, these asymptomatic conditions are detected incidentally using electrocardiography, computed tomography, and magnetic resonance imaging (MRI). Moreover, asymptomatic stroke increases the risk of symptomatic stroke and vascular cognitive impairment^17,18,19. Asymptomatic myocardial ischemia is more common in the elderly^20,21 and patients with DM^22,23,24,25, and is associated with increased cardiac and all-cause mortality, which is particularly important in the presence of other coronary disease risk factors^26,27,28,29. A diagnosis of asymptomatic stroke or myocardial ischemia does not imply the need for aggressive invasive treatment. However, intensive treatment of lifestyle-related diseases should be considered^2,3. Choosing patients with a high likelihood of asymptomatic disease for treatment is reasonable to prevent the future development of symptomatic CCVD and may be an economically advantageous approach owing to optimized target selection and prevention of unnecessary treatment costs. Further research on the practical efficiency and health economics of this algorithm is required.

The results of the causal inference, a novel element of this study, are surprising. We expected a causal relationship between worsening lifestyle habits and cerebrovascular diseases via lifestyle-related diseases, similar to the “metabolic domino” proposed by Ito et al.³⁰. However, in the present study, behavior was observed to change after the onset of CCVD. This result may be attributed to the change in participants’ health awareness after the onset of CCVD and the effectiveness of the National Health Guidance and Health Guidance System. When participants develop CCVD and become aware of its symptoms for the first time, they may become more conscious of the changes they need to make to prevent further exacerbation of the disease and try to improve their lifestyle. It is also not yet clear whether this system is an effective mechanism for encouraging participants to change their behavior for the primary prevention of CCVD.

Fukuma et al. reported no evidence that the Japanese government-led national health guidance intervention was associated with improvements in cardiovascular risk factors among Japanese working-age men³¹. They evaluated changes in obesity status and cardiovascular risk factors (blood pressure, hemoglobin A1c levels, HDL cholesterol levels) 1 to 4 years after screening, but not stroke, IHD, or CCVD. However, if interventions do not improve cardiovascular risk factors, they are unlikely to prevent the development of CCVD either. General health screening programs in other countries have also been reported to be ineffective in reducing mortality from cardiovascular disease^32,33. The Inter99 trial in Denmark reported that the incidence of IHD, stroke, and total mortality after 10 years did not differ significantly between the intervention and control groups, even with interventions such as health checkups, lifestyle guidance for five years, and, if necessary, referral to a medical institution³³. The ensemble causal analysis suggests that the primary prevention system of metabolic syndrome screening and specific health guidance in Japan has not fully achieved its objectives, such as preventing CCVD.

The ensemble causal network revealed significant relationships such as: (i) The causal relationship “the lower the chewing ability, the more weight gain” was found in all five causal inference models. Several studies have suggested that chewing slowly and often during meals is associated with a lower BMI^34,35,36. Chewing well is an effective way to reduce the rate of eating and may contribute to a lower risk of obesity^37,38. (ii) Sleep habits are frequently associated with diseases and other lifestyle habits in a causal manner. Short sleep duration and sleep disturbances are associated with adverse cardiometabolic risks such as obesity, HT, type 2 DM, and cardiovascular disease^39,40. IHD is complicated by heart failure (HF). Approximately 75% of patients with heart failure experience sleep disturbance⁴¹. Obstructive sleep apnea, an obvious risk factor for heart disease, is associated with poor sleep quality, HT, and DL⁴². Thus, sleep may have multifaceted effects on the development of CCVD.

Finally, the observed variability in the causal inference results can be attributed to systematic biases, such as confounding, selection bias, and measurement bias⁴³. Causal inference may be more susceptible to bias from potential confounding factors when the sample size is small, resulting in variable causal relationships. Shimizu et al. reported a false causal association when analyzing a 1300-sample, six-feature dataset with potential confounding factors using Direct LiNGAM⁴⁴. However, when the sample size of the dataset is large, the estimated causal direction converges to the correct one in Direct LiNGAM^44,45. The present dataset contained approximately 200,000 cases, so sample size is unlikely to have substantially distorted the inferred causal directions. Additionally, a selection bias may have occurred in the population. Nakao et al. reported that those who participated in the health guidance intervention improved more dramatically in both weight and cardiovascular risk factors than those who did not participate⁴⁶. Predictive models and causal relationships may differ significantly depending on the participation rate of health guidance interventions. The rate of specific health guidance provided in Kanazawa City ranged from 21.9 to 35.8%, while the national average implementation rate of specific health guidance was low, ranging from 7.74 to 23.2%⁴⁷. These regional differences may affect the predictive accuracy and causality of the developed algorithms. Moreover, lifestyle and medical history information relied on the participants’ responses to the questionnaire. Measurement bias may have existed if the participants misreported their health information. It is therefore impossible to eliminate all biases when using real-world data, and we analyzed and integrated the results using five different methods to ensure their robustness in the presence of these variations. All models consistently indicated that CCVD affected lifestyle-related diseases or specific lifestyle habits. The ensemble causal network method was more reliable than the single-model causal inference method.

In conclusion, we established a predictive model for CCVD using data from lifestyle questionnaires, physical observations, medical histories, and general laboratory results obtained during health examinations. Using real-world data, we used an ensemble causal network to represent the causal relationships among lifestyle, lifestyle-related diseases, and CCVD. The algorithm can predict whether health checkup participants experience asymptomatic myocardial ischemia and stroke, which may lead to savings in healthcare costs through the efficient use of healthcare resources for secondary health checkups. However, the results of this causal inference should be interpreted with caution. The results should be analyzed in other populations, and the reproducibility of the causal relationships should be confirmed. Further research is required to determine the usefulness of this algorithm in the Japanese health checkup system.

Materials and methods

Study participants

This study design is a secondary data analysis using community health screening. The study included 473,734 participants aged 40 years or older who underwent community health screening in Kanazawa City between 2009 and 2018. Medical institutions in charge of health checkups were sent identical manuals in accordance with the guidelines of the respective associations and checkups were conducted accordingly. During the checkups, clinicians performed a standard consultation and recorded data on height, weight, waist circumference, blood pressure, biochemical test results, urinalysis, and lifestyle questionnaires⁶. The study was approved by the Ethics Committee of the Kanazawa Medical Association (KMA) (No. 16000003) and the Ethics Committee of Kanazawa University (No. 2019-080) and was conducted in accordance with the Declaration of Helsinki and ethical guidelines for human medical research. All data were anonymized. The Ethics Committee waived the need for informed consent, as this was secondary data use. An opt-out notification form regarding the study was provided on the KMA website (http://www.kma.jp/kenkyu/kenkyu_index.html).

Features

The KMA database contains information on various clinical parameters as previously reported⁶. Six variables were collected from the dataset: age, sex, body mass index (BMI), waist circumference, systolic blood pressure (SBP), and diastolic blood pressure (DBP).

Laboratory blood parameters were measured within 24 h of collection using an automated clinical chemistry analyzer. These parameters included plasma glucose (PG), hemoglobin A1c (HbA1c), total cholesterol (T-Cho), triglycerides (TG), low-density lipoprotein cholesterol, high-density lipoprotein cholesterol (HDL-C), serum creatinine, estimated glomerular filtration rate (eGFR), serum uric acid, aspartate aminotransferase, alanine aminotransferase, gamma-glutamyl transpeptidase, white blood cell count, red blood cell count, hemoglobin, hematocrit, mean corpuscular volume (MCV), mean corpuscular hemoglobin, mean corpuscular hemoglobin concentration, and platelet count. The test procedures followed the specimen testing methods recommended by the Japanese Society for Clinical Chemistry.

Standardized questionnaire items for specific health checkups were developed by experts under the initiative of the Japanese Ministry of Health, Labor and Welfare. The questionnaire covered medical history including treatment for hypertension (HT), diabetes (DM), and dyslipidemia (DL). We also inquired whether the patient had previously experienced stroke, ischemic heart disease (IHD), chronic kidney disease (CKD), or anemia. The questionnaire comprised 13 lifestyle-related questions. The details of the questionnaire are presented in Supplementary Table 2.

Dataset construction

Two datasets were assembled to develop a predictive model based on questionnaire responses. Dataset 1 comprised data from 13 lifestyle-related questionnaire items in addition to age, sex, BMI, waist circumference, SBP, and DBP. Dataset 2 consisted of Dataset 1 plus the blood and urine test results. The prediction targets, namely stroke, IHD, and the combination of stroke and IHD (CCVD), were defined from the questionnaire responses.

Statistical analysis for clinical background

Data are expressed as mean (SD) or percentage. The clinical background of each disease group was compared with that of the normal group. Normality was assessed using the Shapiro–Wilk test. Normally distributed data with equal variances were compared using the Student’s t-test, whereas data with unequal variances were compared using Welch’s t-test. P < 0.05 was considered significant. Statistical analyses were performed using Python 3.8.3 programming language (Python Software Foundation, Wilmington, DE, USA), and SciPy 1.5.2.
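The difference between the two tests can be made concrete with a short sketch. The functions below compute only the test statistics and degrees of freedom using the standard library; the function names and the sample values are illustrative assumptions, not study data. To obtain P values, SciPy's `scipy.stats.ttest_ind` accepts an `equal_var` argument that switches between the Student and Welch variants.

```python
from math import sqrt
from statistics import mean, variance

def student_t(a, b):
    """Student's t statistic with pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical systolic blood pressure values (mmHg), for illustration only.
disease_group = [130, 135, 128, 140, 132]
normal_group = [120, 122, 118, 125, 121, 119]

t_s, df_s = student_t(disease_group, normal_group)
t_w, df_w = welch_t(disease_group, normal_group)
print(f"Student: t = {t_s:.2f}, df = {df_s}")
print(f"Welch:   t = {t_w:.2f}, df = {df_w:.2f}")
```

Welch's degrees of freedom never exceed the pooled value, so the Welch test is the more conservative choice when the group variances differ.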

General process of prediction model construction

The procedure for building and predicting the CCVD model is illustrated in Fig. 3. Participants with missing data were excluded. The dataset was split into training and test sets in a 7:3 ratio using stratified sampling, which preserves the class distribution in both sets.
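A stratified 7:3 split can be sketched in plain Python as below. The function name, the seed, and the toy labels are assumptions for illustration; the source does not state which routine was used. Libraries such as scikit-learn provide the same behavior through `train_test_split` with its `stratify` argument.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) so that each class keeps the same share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment within each class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Toy imbalanced labels: 20 disease cases (1) and 80 normal cases (0).
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))     # sizes of the two sets
print(sum(labels[i] for i in test_idx))  # disease cases in the test set
```

Splitting within each class keeps the disease prevalence identical in the training and test sets, which matters when the positive class is rare.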

The machine learning algorithms used were Light Gradient Boosting Machine (LGBM)⁴⁸, Random Forest (RF)⁴⁹, Logistic regression (LR)⁵⁰, and eXtreme-Gradient-Boosting (XGBoost)⁵¹. Hyperparameter tuning and feature selection by permutation importance were applied to each model to optimize its performance. Hyperparameter tuning is the process of adjusting the external configuration values that control a model’s learning process to maximize its predictive performance. Permutation importance is a method for evaluating the importance of a feature: we randomly shuffled the values of one feature at a time, evaluated how the shuffling affected the performance of the model, and ranked the features in order of importance. Based on the ranked list, the minimum set of features needed to maintain or improve the model’s performance was selected. To address the imbalance in the number of cases between the CCVD and healthy groups, a Balanced Bagging Classifier was used. The Balanced Bagging Classifier combines bagging with undersampling, and LGBM, RF, LR, or XGBoost was used as its base estimator. The Youden index was used to determine the cutoff value on the receiver operating characteristic (ROC) curve. The model-construction process was repeated 20 times. Predictive metrics, such as accuracy, area under the ROC curve (AUC), sensitivity, and specificity, were computed and averaged. For models utilizing Dataset 2, we gauged the significance of the clinical tests via permutation importance⁵² and selected the feature set exhibiting the highest AUC⁵³.
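Two of the steps above, permutation importance and the Youden index cutoff, can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: the toy rule-based classifier, the use of accuracy as the metric, the function names, and the toy data. The study itself applied these ideas to trained LGBM, RF, LR, and XGBoost models.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

def youden_cutoff(scores, labels):
    """Threshold that maximizes J = sensitivity + specificity - 1."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(s >= t and y == 1 for s, y in zip(scores, labels))
        tn = sum(s < t and y == 0 for s, y in zip(scores, labels))
        j = tp / pos + tn / neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Toy data: feature 0 determines the label, feature 1 is unrelated noise.
rows = [[i / 10, ((i * 7) % 10) / 10] for i in range(10)]
labels = [int(r[0] > 0.5) for r in rows]

def predict(row):
    """Hypothetical stand-in for a trained classifier."""
    return int(row[0] > 0.5)

importances = permutation_importance(predict, rows, labels)

# Toy predicted probabilities with one misordered pair (0.4 positive, 0.6 negative).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
truth = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(scores, truth)
print(importances, cutoff, j)
```

Shuffling the noise feature leaves accuracy unchanged, so its importance is zero, whereas shuffling the informative feature lowers accuracy. In the toy ROC example the first threshold reaching the maximum J is 0.4, where sensitivity is 1.0 and specificity is 0.75.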

Construction and integration of causal inference

Causal inferences were drawn using the Direct Linear Non-Gaussian Acyclic Model (LiNGAM)⁴⁴, Non-combinatorial Optimization via Trace Exponential and Augmented Lagrangian for structure learning (NOTEARS)⁵⁴, NOTEARS with the least absolute shrinkage and selection operator⁵⁴, NOTEARS with PyTorch⁵⁴, and a Bayesian network^55,56. For causal inference, the variables incorporated were CCVD, stroke, IHD, DM, HT, DL, and lifestyle factors. The lifestyle factors were chosen by importance: the top five of the 13 lifestyle items were identified using permutation importance, as follows:

1. Feature importance was determined using permutation importance.
2. The questionnaire items were ranked from 1st to 13th position and scored in descending order: 13 points for 1st place, 12 points for 2nd place, and so on down to 2 points for 12th place and 1 point for 13th place.
3. Scoring was repeated 12 times, once for each combination of the four prediction models and three diseases, to obtain cumulative scores. The cumulative scores determined the top five lifestyle items incorporated into the causal inference.
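The scoring steps above can be sketched as follows. The function names, the item names, and the importance values are hypothetical, and the sketch uses three items and two importance tables instead of 13 items and 12 model-disease combinations. With 13 items and 12 tables, the maximum attainable cumulative score is 13 × 12 = 156.

```python
from collections import defaultdict

def rank_points(importances):
    """Give n points to the most important item, n - 1 to the next, down to 1."""
    ordered = sorted(importances, key=importances.get, reverse=True)
    n = len(ordered)
    return {item: n - rank for rank, item in enumerate(ordered)}

def cumulative_scores(importance_tables):
    """Sum the rank points of each item over all model-disease combinations."""
    totals = defaultdict(int)
    for table in importance_tables:
        for item, points in rank_points(table).items():
            totals[item] += points
    return dict(totals)

# Hypothetical permutation importances from two model-disease combinations.
tables = [
    {"item_a": 0.30, "item_b": 0.20, "item_c": 0.10},
    {"item_a": 0.10, "item_b": 0.40, "item_c": 0.20},
]

scores = cumulative_scores(tables)
top_items = sorted(scores, key=scores.get, reverse=True)[:2]
print(scores, top_items)
```

Because the points depend only on rank, an item that ranks moderately well in every model can outscore one that dominates a single model. The source does not describe how ties are handled; this sketch breaks them by input order.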

All features used for causal inference were standardized and used in the analysis.

Data availability

De-identified participant data can be shared on request. Please contact Shigehiro Karashima at skarashima@staff.kanazawa-u.ac.jp. Access may be restricted by the Kanazawa University Institutional Review Board (IRB) depending on the intended use. The data will be shared as an Excel file after approval by the Kanazawa University IRB.

Code availability

The code used for the analysis is available at https://zenodo.org/records/10703345.

References

Ministry of Health, Labour and Welfare, Japan. The 2021 Vital Statistics. https://www.mhlw.go.jp/english/database/db-hw/vs01.html (2019).
Miyamoto, S. et al. Japan stroke society guideline 2021 for the treatment of stroke. Int. J. Stroke 17, 1039–1049 (2022).
Article PubMed PubMed Central Google Scholar
Kimura, K. et al. JCS 2018 guideline on diagnosis and treatment of acute coronary syndrome. Circ. J. 83, 1085–1196 (2019).
Article CAS PubMed Google Scholar
Maron, D. J. et al. Initial invasive or conservative strategy for stable coronary disease. N. Engl. J. Med. 382, 1395–1407 (2020).
Article PubMed PubMed Central Google Scholar
The Ministry of Health, Labour and Welfare. About Specific Health Screening and Specific Health Guidance. https://www.mhlw.go.jp/stf/seisakunitsuite/bunya/0000161103.html.
Kawakami, M. et al. Explainable machine learning for atrial fibrillation in the general population using a generalized additive. Cross-Sectional study. Circ. Rep. 4, 73–82 (2021).
Article PubMed PubMed Central Google Scholar
Karashima, S. et al. A hyperaldosteronism subtypes predictive model using ensemble learning. Sci. Rep. 13, 3043 (2023).
Article ADS CAS PubMed PubMed Central Google Scholar
Mobley, B. A., Schechter, E., Moore, W. E., McKee, P. A. & Eichner, J. E. Predictions of coronary artery stenosis by artificial neural network. Artif. Intell. Med. 18, 187–203 (2000).
Article CAS PubMed Google Scholar
Bivard, A., Churilov, L. & Parsons, M. Artificial intelligence for decision support in acute stroke—Current roles and potential. Nat. Rev. Neurol. 16, 575–585 (2020).
Article PubMed Google Scholar
Prosperi, M. et al. Causal inference and counterfactual prediction in machine learning for actionable healthcare. Nat. Mach. Intell. 2, 369–375 (2020).
Article Google Scholar
Ohlsson, H. & Kendler, K. S. Applying causal inference methods in psychiatric epidemiology: A review. JAMA Psychiatry 77, 637–644 (2020).
Article PubMed PubMed Central Google Scholar
Li, J. X. et al. Machine learning identifies prominent factors associated with cardiovascular disease: Findings from two million adults in the Kashgar Prospective Cohort Study (KPCS). Glob. Health Res. Policy 7, 48 (2022).
Article CAS PubMed PubMed Central Google Scholar
Yang, L. et al. Study of cardiovascular disease prediction model based on random forest in eastern China. Sci. Rep. 10, 5245 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Zhu, Q. et al. A model for risk prediction of cerebrovascular disease prevalence-Based on community residents aged 40 and above in a City in China. Int. J. Environ. Res. Public Health 18, 6584 (2021).
Article PubMed PubMed Central Google Scholar
Longstreth, W. T. Jr. et al. Lacunar infarcts defined by magnetic resonance imaging of 3660 elderly people: The cardiovascular health study. Arch. Neurol. 55, 1217–1225 (1998).
Article PubMed Google Scholar
Conti, C. R., Bavry, A. A. & Petersen, J. W. Silent ischemia: Clinical relevance. J. Am. Coll. Cardiol. 59, 435–441 (2012).
Article PubMed Google Scholar
Bernick, C. et al. Silent MRI infarcts and the risk of future stroke: The cardiovascular health study. Neurology 57, 1222–1229 (2001).
Article CAS PubMed Google Scholar
Vermeer, S. E. et al. Silent brain infarcts and white matter lesions increase stroke risk in the general population: The Rotterdam scan study. Stroke 34, 1126–1129 (2003).
Article PubMed Google Scholar
Debette, S. et al. Association of MRI markers of vascular brain injury with incident stroke, mild cognitive impairment, dementia, and mortality: The Framingham Offspring Study. Stroke 41, 600–606 (2010).
Article PubMed PubMed Central Google Scholar
Gottlieb, S. O. & Gerstenblith, G. Silent myocardial ischemia in the elderly: Current concepts. Geriatrics 43, 29–34 (1988).
CAS PubMed Google Scholar
Fleg, J. L. et al. Prevalence and prognostic significance of exercise-induced silent myocardial ischemia detected by thallium scintigraphy and electrocardiography in asymptomatic volunteers. Circulation 81, 428–436 (1990).
Article CAS PubMed Google Scholar
Ambepityia, G. et al. Exertional myocardial ischemia in diabetes: A quantitative analysis of anginal perceptual threshold and the influence of autonomic function. J. Am. Coll. Cardiol. 15, 72–77 (1990).
Article CAS PubMed Google Scholar
Naka, M. et al. Silent myocardial ischemia in patients with non-insulin-dependent diabetes mellitus as judged by treadmill exercise testing and coronary angiography. Am. Heart J. 123, 46–53 (1992).
Article CAS PubMed Google Scholar
Aronow, W. S., Mercando, A. D. & Epstein, S. Prevalence of silent myocardial ischemia detected by 24-hour ambulatory electrocardiography, and its association with new coronary events at 40-month follow-up in elderly diabetic and nondiabetic patients with coronary artery disease. Am. J. Cardiol. 69, 555–556 (1992).
Article CAS PubMed Google Scholar
Nakao, Y. M. et al. Holter monitoring for the screening of cardiac disease in diabetes mellitus: The non-invasive Holter monitoring observation of new cardiac events in diabetics study. Diab. Vasc. Dis. Res. 12, 396–404 (2015).
Article CAS PubMed Google Scholar
Multiple Risk Factor Intervention Trial Research Group. Exercise electrocardiogram and coronary heart disease mortality in the multiple risk factor intervention trial. Multiple risk factor intervention trial research group. Am. J. Cardiol. 55, 16–24 (1985).
Article Google Scholar
Ekelund, L. G. et al. Coronary heart disease morbidity and mortality in hypercholesterolemic men predicted from an exercise test: The lipid research clinics coronary primary prevention trial. J. Am. Coll. Cardiol. 14, 556–563 (1989).
Article CAS PubMed Google Scholar
Laukkanen, J. A. et al. Exercise-induced silent myocardial ischemia and coronary morbidity and mortality in middle-aged men. J. Am. Coll. Cardiol. 38, 72–79 (2001).
Article CAS PubMed Google Scholar
Gibbons, L. W., Mitchell, T. L., Wei, M., Blair, S. N. & Cooper, K. H. Maximal exercise test as a predictor of risk for mortality from coronary heart disease in asymptomatic men. Am. J. Cardiol. 86, 53–58 (2000).
Itoh, H. Metabolic domino: New concept in lifestyle medicine. Drugs Today (Barc.) 42(Suppl C), 9–16 (2006).
Fukuma, S. et al. Association of the National Health Guidance Intervention for obesity and cardiovascular risks with health outcomes among Japanese men. JAMA Intern. Med. 180, 1630–1637 (2020).
Krogsbøll, L. T., Jørgensen, K. J., Grønhøj Larsen, C. & Gøtzsche, P. C. General health checks in adults for reducing morbidity and mortality from disease: Cochrane systematic review and meta-analysis. BMJ 345, e7191 (2012).
Jørgensen, T. et al. Effect of screening and lifestyle counselling on incidence of ischaemic heart disease in general population: Inter99 randomised trial. BMJ 348, g3617 (2014).
Fukuda, H. et al. Chewing number is related to incremental increases in body weight from 20 years of age in Japanese middle-aged adults. Gerodontology 30, 214–219 (2013).
Ekuni, D., Furuta, M., Takeuchi, N., Tomofuji, T. & Morita, M. Self-reports of eating quickly are related to a decreased number of chews until first swallow, total number of chews, and total duration of chewing in young people. Arch. Oral Biol. 57, 981–986 (2012).
Zhu, Y. & Hollis, J. H. Relationship between chewing behavior and body weight status in fully dentate healthy adults. Int. J. Food Sci. Nutr. 66, 135–139 (2015).
Zhu, Y. & Hollis, J. H. Increasing the number of chews before swallowing reduces meal size in normal-weight, overweight, and obese adults. J. Acad. Nutr. Diet 114, 926–931 (2014).
Cassady, B. A., Hollis, J. H., Fulford, A. D., Considine, R. V. & Mattes, R. D. Mastication of almonds: Effects of lipid bioaccessibility, appetite, and hormone response. Am. J. Clin. Nutr. 89, 794–800 (2009).
St-Onge, M. P. et al. Sleep duration and quality: Impact on lifestyle behaviors and cardiometabolic health: A scientific statement from the American Heart Association. Circulation 134, e367–e386 (2016).
Zuraikat, F. M., Wood, R. A., Barragán, R. & St-Onge, M. P. Sleep and diet: Mounting evidence of a cyclical relationship. Annu. Rev. Nutr. 41, 309–332 (2021).
Jorge-Samitier, P., Fernández-Rodrigo, M. T., Juárez-Vela, R., Antón-Solanas, I. & Gea-Caballero, V. Management of hypnotics in patients with insomnia and heart failure during hospitalization: A systematic review. Nurs. Rep. 11, 373–381 (2021).
Gaines, J., Vgontzas, A. N., Fernandez-Mendoza, J. & Bixler, E. O. Obstructive sleep apnea and the metabolic syndrome: The road to clinically-meaningful phenotyping, improved prognosis, and personalized treatment. Sleep Med. Rev. 42, 211–219 (2018).
Hernán, M. A. & Robins, J. M. Causal Inference: What If (Chapman & Hall/CRC, 2020).
Shimizu, S. et al. DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model. J. Mach. Learn. Res. 12, 1225–1248 (2011).
Genin, K. Statistical undecidability in linear, non-Gaussian causal models in the presence of latent confounders. In Proc. Causal Discovery & Causality-Inspired Machine Learning Workshop, NeurIPS (2020).
Nakao, Y. M. et al. Effectiveness of nationwide screening and lifestyle intervention for abdominal obesity and cardiometabolic risks in Japan: The metabolic syndrome and comprehensive lifestyle intervention study on nationwide database in Japan (MetS ACTION-J study). PLoS ONE 13, e0190862 (2020).
Status of Implementation of Specific Health Check-Ups and Specific Health Guidance. https://www.mhlw.go.jp/stf/seisakunitsuite/bunya/newpage_00043.html (2021).
Ke, G. et al. LightGBM: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 1, 3146–3154 (2017).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Cox, D. R. The Regression Analysis of Binary Sequences [with Discussion] (1958).
Chen, T. & Guestrin, C. XGBoost: A Scalable Tree Boosting System (2016).
Altmann, A., Toloşi, L., Sander, O. & Lengauer, T. Permutation importance: A corrected feature importance measure. Bioinformatics 26, 1340–1347 (2010).
Youden, W. J. Index for rating diagnostic tests. Cancer 3, 32–35 (1950).
Zheng, X., Aragam, B., Ravikumar, P. & Xing, E. P. DAGs with NO TEARS: Continuous optimization for structure learning. Preprint at http://arXiv.org/1803.01422.
Heckerman, D., Geiger, D. & Chickering, D. M. Learning Bayesian networks: The combination of knowledge and statistical data. Mach. Learn. 20, 197–243 (1995).
Tsamardinos, I., Brown, L. E. & Aliferis, C. F. The max–min hill-climbing Bayesian network structure learning algorithm. Mach. Learn. 65, 31–78 (2006).

Acknowledgements

The authors thank Editage (Tokyo, Japan; www.editage.jp) for the English language editing.

Author information

Authors and Affiliations

School of Electrical Information Communication Engineering, College of Science and Engineering, Kanazawa University, Kanazawa, Japan
Riku Nambo
Institute of Liberal Arts and Science, Kanazawa University, Kanazawa, Japan
Shigehiro Karashima
Department of Health Promotion and Medicine of the Future, Kanazawa University, Kanazawa, Japan
Ren Mizoguchi, Seigo Konishi, Atsushi Hashimoto, Daisuke Aono, Mitsuhiro Kometani & Takashi Yoneda
Health Care Center, Japan Advanced Institute of Science and Technology, Nomi, Japan
Kenji Furukawa
Faculty of Electrical, Information and Communication Engineering, Institute of Science and Engineering, Kanazawa University, Kanazawa, Japan
Kousuke Imamura
Institute of Transdisciplinary Sciences, Kanazawa University, Kanazawa, Japan
Hidetaka Nambo

Contributions

SK and HN designed the study and evaluated and edited the manuscript. SeiK, AH, DA, MK, and TY supervised the consultations. KF collected the data. RN and RM performed the statistical analyses. RN, SK, RM, AH, KI, and HN contributed to the analysis and interpretation of the data. RN, SK, and HN wrote the manuscript. SK, TY, KI, and HN supervised the study. All authors have checked and approved the final version of the manuscript.

Corresponding authors

Correspondence to Shigehiro Karashima or Hidetaka Nambo.

Ethics declarations

Competing interests

K. Furukawa received lecture fees from Sanofi K.K., Eli Lilly Japan K.K., and Ono Pharmaceutical Co., Ltd. The remaining authors declare no conflicts of interest with respect to this research or paper.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Figure S1.

Supplementary Figure S2.

Supplementary Figure S3.

Supplementary Information.

Supplementary Tables.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Nambo, R., Karashima, S., Mizoguchi, R. et al. Prediction and causal inference of cardiovascular and cerebrovascular diseases based on lifestyle questionnaires. Sci Rep 14, 10492 (2024). https://doi.org/10.1038/s41598-024-61047-w

Received: 14 October 2023
Accepted: 30 April 2024
Published: 07 May 2024
DOI: https://doi.org/10.1038/s41598-024-61047-w
Springer Nature Limited
