Introduction

Chronic diseases that are decisively affected by lifestyle choices are the leading causes of death globally [1]. In 2010, studies suggested that smoking, including inhalation of secondhand smoke, was responsible for 6.3% of the global disease burden, whereas alcohol consumption accounted for 3.9%, and diet-related risk factors and physical inactivity combined accounted for 10% of global disability-adjusted life years (DALYs) [2]. The adverse effects of these modifiable risk behaviors account for huge societal and monetary costs [3]. Research suggests that modifiable risk behaviors cluster within individuals [4], which implies that they are not randomly distributed across the population but tend to co-occur in the same people. These modifiable risk behaviors appear to be associated with a substantially increased risk of mortality when combined [5], which suggests that their health effects are multiplicative rather than additive. Therefore, examining the clustering patterns of modifiable risk behaviors is essential, since this information provides insights into their etiology.

Sociodemographic factors reflect a structural position that shapes the practice of health-behavior [6]. In addition, sociodemographic characteristics provide the setting in which behavior can turn into habits that support certain types of lifestyles related to health. An increasing number of studies have explored modifiable risk behaviors that appear in clusters within specific sociodemographic groups [4]. However, to our knowledge, only a limited number of studies [7, 8] have explored sociodemographic deviations in the clustering of these modifiable risk behaviors so as to identify the groups that are most at risk. Further, these studies recommend that proximate factors, such as education and income, need to be explored since they also have major implications for behavior related to health. Adults with low socioeconomic status (SES) are expected to engage in risky health-behaviors [9], thus increasing their susceptibility to poor health. Low SES impacts an individual’s health for several reasons, including a lack of access to health care, substandard living conditions, an inadequate understanding of the negative outcomes of health-compromising behavior, and high levels of psychological stress [9].

Information on health-behavior clustering may help with designing disease prevention programs. Interventions that focus on single risk behaviors do little to ensure continuing changes in health-behavior or health-related outcomes [10]. Furthermore, knowing the pattern of a modifiable risk behavior cluster will help health professionals plan more powerful intervention strategies on the grounds that interventions on multiple behaviors affect public health to a greater degree than an intervention focused on a single behavior [11]. Variations observed in the literature on this topic have been attributed to different methodologies and to the differences in health-behavior that were investigated. Prior studies have applied latent class analyses to characterize the clustering of modifiable risk behaviors [4, 12]; however, only a few studies have investigated the association between latent classes of health-behavior and health outcomes, such as all-cause mortality [8]. Onge and colleagues (2017) found that health-behavior classes are related to prospective mortality, implying that they are valid representations of a given population [8]; however, this study lacked information related to key factors, such as diet and SES, that have implications for health-behavior. Proper nutrition and dietary habits contribute to a better quality of life in later life by reducing the risk of many chronic conditions [13]. Dietary habits cluster with other risk behaviors in a complex way, resulting in both healthy and unhealthy groups [14]. To promote dietary behavior change, diet must be included as a key behavior when exploring clustering patterns and their relationship with health outcomes. Therefore, it is important to determine how to characterize key modifiable risk behavior clusters and determine their relationship to mortality.
This study thus aims to (1) use latent class analysis (LCA) to identify latent classes of modifiable risk behavior, (2) explore the association between sociodemographic factors and the identified classes, and (3) assess the association of the identified latent classes with all-cause mortality.

Methods

Setting and Participants

The data used in the present study were obtained from a large prospective cohort study in Taiwan conducted between 1998 and 2019. Participants had joined a standardized service fee medical screening program provided by a private firm (MJ Health Management Institution, Taipei, Taiwan) [15, 16]. Every participant received a series of examinations, including anthropometric measurements, a physical examination, and biochemical tests of blood and urine. In addition, the participants filled out a self-administered questionnaire on their medical history and lifestyle. All participants visited on a yearly basis, and the same questionnaires were filled out on every visit. All behavioral and sociodemographic variables included in the study were collected at baseline. The study participants were 290,279 adults aged 21 years and older who had at least one check-up during the period from 1998 to 2006. Further information on the data collection is provided in Supplementary File 1. To obtain information on all-cause mortality, we linked our dataset with the mortality data of our study participants obtained from the Ministry of Health and Welfare, Taiwan. The study participants provided informed written consent to participate and gave permission to process the data from their medical screening [17]. The study protocols were approved by the institutional review board of the National Cheng Kung University Hospital (IRB No. A-ER-109011).

Health-Related Measures

The study included variables related to lifestyle behavior comprising smoking status, alcohol consumption, sleep pattern, physical activity, and dietary intake. All of these indicators were included in the analysis. Firstly, the indicators were modeled as categorical variables to measure health-behavior clusters. Categorical indicators were used instead of binary indicators because categorical indicators can be used to characterize individuals across the process of functioning as well as to identify those who have subclinical levels of risk. Smoking status was assessed by categorizing cigarette smoking as follows: does not smoke, previous smoker but has quit, does not smoke but is exposed to passive smoke, occasionally smokes, and smokes every day. Alcohol consumption was assessed by categorizing alcohol consumption as follows: does not drink, previous drinker but has quit, occasional drinking, and daily drinking. Sleep pattern was measured using sleep duration, which was categorized as follows: less than 6 h, 6 to 8 h, and more than 8 h.

Dietary intake was measured using fruit and vegetable consumption. The respondents were asked how many servings of fruits and vegetables they ate per day. The frequently recommended five servings of fruits and vegetables per day were considered the minimum [18]. Based on guidelines from Nutrition Security and Optimal Dietary Intake in Taiwan, one serving of vegetables is equivalent to an uncooked edible serving of about 100 g, which is similar to cooked vegetables in a dish (diameter: 15 cm, or about the size of a disc) or about half a bowl [19]. Likewise, one serving of fruit is approximately equivalent to a fist-sized portion or one rice bowl filled with cut fruit [19]. Therefore, we categorized fruit and vegetable intake as less than 1 serving, 1–2 servings, 2–3 servings, 3–4 servings, and more than 4 servings.

To calculate activity intensity and energy expenditure in kilocalories, the metabolic equivalent (MET) (kcal/kg/h) was used [20]. All respondents who indicated activities in more than one intensity category were assigned an average MET value. The activity MET score of a respondent was calculated as the product of the average MET and duration. Further, to calculate the total MET score, we added the products across all activities. Therefore, total physical activity (PA) was measured based on categorizing the MET scores into four subgroups: highly active, active, insufficiently active, and inactive. Detailed information on how these indicators were categorized and the assigned cut-off scores is available in Supplementary File 2 (File 2: Measurement and cut-off).
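
The scoring steps described above can be sketched in a few lines of Python. This is an illustration only, not the study's implementation: the function names and the example MET values are hypothetical, durations are assumed to be in hours, and the cut-off thresholds are placeholders (the actual cut-offs are those reported in Supplementary File 2).

```python
from bisect import bisect_right
from statistics import mean

# Ordered from least to most active, matching the four PA subgroups.
PA_LABELS = ["inactive", "insufficiently active", "active", "highly active"]

# HYPOTHETICAL cut-offs (MET-hours); placeholders, not the study's values.
EXAMPLE_CUTOFFS = (5.0, 10.0, 20.0)


def activity_met_score(met_values, hours):
    """Average the MET values of the reported intensity categories,
    then multiply by the duration of the activity."""
    return mean(met_values) * hours


def total_met_score(activities):
    """Sum the activity scores; `activities` is a list of (met_values, hours)."""
    return sum(activity_met_score(mets, hours) for mets, hours in activities)


def classify_pa(total, cutoffs=EXAMPLE_CUTOFFS):
    """Map a total MET score to a PA subgroup using ascending cut-offs."""
    return PA_LABELS[bisect_right(cutoffs, total)]


# Hypothetical respondent: one activity reported in two intensity categories
# (averaged to 3.5 MET) for 2 h, and one single-intensity activity for 1 h.
respondent = [([2.5, 4.5], 2.0), ([6.5], 1.0)]
score = total_met_score(respondent)   # 3.5 * 2.0 + 6.5 * 1.0
group = classify_pa(score)
```

Passing the cut-offs as a parameter keeps the scoring logic separate from the thresholds, so the placeholder values can be swapped for the published ones without touching the functions.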

Statistical Analysis

The analyses were performed using STATA 15 and Latent Gold 5.0. Latent class analysis was conducted in Latent Gold, as LCA enables the characterization of unobserved variables starting from an analysis of the relationship among several observed variables using maximum likelihood estimation methods [21]. It is a confirmatory method used to test hypotheses regarding a priori assertions about the structure of the relationship among the observed variables [22]. Compared with alternative approaches, such as cluster analysis, LCA has several advantages: it estimates population characteristics from the sample data, adjusts for estimated measurement error, and helps determine the number of classes. Moreover, it provides probabilities that can be used for the interpretation of results and allows flexible treatment of variance among classes [23].
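
For readers less familiar with LCA, the standard latent class model can be written as follows. The notation is introduced here for illustration and is not taken from the original analysis: $C$ is the number of classes, $\gamma_c$ is the prevalence of class $c$, and $\rho_{j,r \mid c}$ is the probability of response category $r$ on indicator $j$ given membership in class $c$. The $J = 5$ behavioral indicators are assumed to be conditionally independent within a class.

```latex
P(Y_i = y) = \sum_{c=1}^{C} \gamma_c \prod_{j=1}^{J} \rho_{j, y_j \mid c},
\qquad
P(c \mid Y_i = y) =
  \frac{\gamma_c \prod_{j=1}^{J} \rho_{j, y_j \mid c}}
       {\sum_{c'=1}^{C} \gamma_{c'} \prod_{j=1}^{J} \rho_{j, y_j \mid c'}}
```

The first expression is the probability of a respondent's observed response pattern, which maximum likelihood estimation maximizes over the sample. The second is the posterior class membership probability, which is typically used to assign each respondent to the most likely class.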

To determine the appropriate number of classes and maximize the model fit, we started with a two-class model and progressively increased the number of classes by one, up to a seven-class model. To select a model on the basis of fit and interpretability, the LL (log likelihood), AIC (Akaike information criterion), CAIC (consistent Akaike information criterion), BIC (Bayesian information criterion), and adjusted BIC (adjusted Bayesian information criterion) were examined across all models, and the model with the lowest values of the information criteria was identified [24, 25]. Furthermore, LCA provides an estimation of class membership probabilities. After the best-fitting model was obtained, the output data were imported back to STATA for further analysis, and the association between class membership and sociodemographic factors was examined using multinomial logistic regression. We also calculated time to event from the date of enrollment to the date of death or the end of the cohort follow-up (i.e., September 31, 2019), whichever came first. Further, a Cox proportional hazards regression analysis was used to assess the association between the latent classes of health-behavior and all-cause mortality.
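
The time-to-event rule described above (follow-up ends at death or at the end of the cohort follow-up, whichever comes first) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the example participants, the days-to-years conversion, and the end-of-follow-up date used in the example are all hypothetical placeholders rather than values from the study.

```python
from datetime import date

# Assumption: days are converted to years using the mean Julian year.
DAYS_PER_YEAR = 365.25


def follow_up(enrolled, died, study_end):
    """Return (years_at_risk, event) for one participant.

    `died` is None when no death was recorded. Follow-up stops at death or
    at `study_end`, whichever comes first; a death recorded after
    `study_end` is treated as censored.
    """
    if died is not None and died <= study_end:
        end, event = died, 1
    else:
        end, event = study_end, 0
    return (end - enrolled).days / DAYS_PER_YEAR, event


# Placeholder end-of-follow-up date, NOT the study's cut-off.
study_end = date(2019, 1, 1)

# Hypothetical participants: one death during follow-up, one censored.
t_dead, e_dead = follow_up(date(2000, 1, 1), date(2010, 1, 1), study_end)
t_cens, e_cens = follow_up(date(2005, 6, 30), None, study_end)
```

The resulting pairs of follow-up time and event indicator are the usual inputs to a Cox proportional hazards model in any statistical package.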

Results

Descriptive Characteristics of the Sample

A total of 290,279 respondents participated in this study. The baseline characteristics of the participants are presented in Table 1. The sample consists of 51.11% females versus 48.89% males, with a total mean age of 40 (SD = 12.4). Of the total sample, 23% were unmarried, 70% were married, 2.5% were divorced, and 3.8% were widowed. Furthermore, 3.2% of the participants were illiterate, 17.9% had a junior high school level of education, 24% had a high school level of education, and 54% had an undergraduate level of education or above. Nearly 65% of the sample participants had never smoked cigarettes, while 19% of the participants smoked daily. Of the participants, 81% were nondrinkers and 51% were inactive. Furthermore, a low percentage of the participants were following fruit and vegetable consumption norms.

Table 1 Participant characteristics

Goodness of Fit and Description of Latent Classes

As illustrated in Table 2, the seven-class model was the best fit, representing an adequate solution for the data because it had the lowest BIC and CAIC values. However, the accuracy of AIC is known to decrease as the sample size increases because it includes no adjustment for sample size [26]. Entropy was reported only to demonstrate the precision with which the cases were classified in the profiles (on a 0 to 1 scale); it does not serve as a main indicator by which to determine the optimal number of profiles [26]. For categorical and continuous variables, BIC and CAIC have been shown to identify the correct class model close to 100% of the time [27]. Therefore, BIC and CAIC in Table 2 suggest Model 7 as the best-fitting model.
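
The role of sample size in these criteria can be made concrete with a short sketch based on their textbook definitions. Only the sample size is taken from the study. The log-likelihoods and parameter counts below are hypothetical values chosen for illustration, and the function names are ours. They are not the values in Table 2.

```python
import math


def information_criteria(log_lik, n_params, n_obs):
    """Return AIC, BIC, CAIC and sample-size-adjusted BIC for one model."""
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2.0 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "CAIC": deviance + n_params * (math.log(n_obs) + 1.0),
        "aBIC": deviance + n_params * math.log((n_obs + 2.0) / 24.0),
    }


def best_model(fits, n_obs, criterion="BIC"):
    """Pick the number of classes with the lowest value of `criterion`.

    `fits` maps number of classes -> (log-likelihood, free parameters).
    """
    return min(
        fits,
        key=lambda k: information_criteria(*fits[k], n_obs)[criterion],
    )


N = 290_279  # sample size reported in the study

# HYPOTHETICAL log-likelihoods and parameter counts, for illustration only.
example_fits = {
    2: (-1_000_000.0, 33),
    3: (-990_000.0, 50),
    4: (-989_950.0, 67),
}

chosen_bic = best_model(example_fits, N, "BIC")
chosen_aic = best_model(example_fits, N, "AIC")

# Penalty per extra parameter: ln(N), about 12.58, for BIC versus 2 for AIC.
bic_penalty_per_param = math.log(N)
```

In this made-up example the small likelihood gain from a fourth class is enough to satisfy AIC but not BIC, which shows how AIC's fixed penalty can favor larger models in very large samples.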

Table 2 Goodness of fit for latent class models (n = 290,279)

As shown in Table 3, Class 1 was the most prevalent class with regard to all five health-behaviors, accounting for 34.9% of the sample, where health-behavior was characterized as “inactive, secondhand smoker, and low dietary intake.” Class 1 was distinct due to a higher proportion of adults who were inactive and relatively frequently exposed to secondhand smoke. Class 2 accounted for 25.2% of the sample and was characterized as “nondrinker, adequate sleep, and somewhat active.” This class was distinct due to a higher proportion of adults who never drink and who sleep 6–8 h a day. Classes 3 and 4 accounted for 13.4% and 12.4% of the sample, respectively. Class 3 was characterized as “nonsmoker, nondrinker, and higher dietary intake.” Compared to other classes, Class 3 was distinct due to having the highest share of adults who never smoke and eat more vegetables and fruits than was the case for the other classes. Class 4 had health-behavior characterized as “casual smoker, casual drinker, and somewhat active.” This class was distinct due to a higher share of adults who are occasional smokers, as well as drinkers, and are somewhat active.

Table 3 Prevalence of latent classes and conditional probabilities within each latent class

Classes 5 and 6 accounted for 9.8% and 2.9% of the sample, respectively. Class 5 had health-behavior characterized as “daily smoker, daily drinker, inactive, and low dietary intake,” and Class 6 had health-behavior characterized as “daily smoker, occasional drinker, and highly active.” Class 5 was distinct due to a higher proportion of adults who smoke and drink every day and eat less than one serving of fruits and vegetables. Further, Class 6 was distinct due to a higher share of adults who were highly active. Class 7 accounted for only 1% of the sample population and was characterized as “previous smoker, previous drinker, moderate sleep, and inactive.” This class included the largest share of adults who were previous smokers and drinkers but had quit. It was also distinct due to comprising a higher share of adults who have good sleep based on the norm. The graph in Supplementary File 3 shows the prevalence of the latent classes and the conditional probabilities within each class.

Association Between Sociodemographic Characteristics and Latent Class Membership

Table 4 shows the sociodemographic association with latent class membership, using the class with the least unhealthy behavior as the referent (Class 3). In addition, Supplementary File 4 includes Table 4.1, which provides the full descriptive statistics for each group. Women had higher odds of belonging to Class 1 and Class 2. Class 5 members were 86% more likely to be young adults than referent class (Class 3) members. Class 5 members had 74% higher odds of being illiterate compared with Class 3. Compared with the referent class, Class 4 was 60% more likely to have a high school level of education, and Class 2 was 56% more likely to have an undergraduate level of education or above.
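
The odds statements above can be read against the usual multinomial logistic formulation, sketched here in generic notation with Class 3 as the referent. The symbols are introduced for illustration and do not appear in the original analysis: $\mathbf{x}_i$ is the vector of sociodemographic covariates for respondent $i$, and $\beta_{k,j}$ is the coefficient of covariate $j$ for class $k$.

```latex
\log \frac{P(C_i = k \mid \mathbf{x}_i)}{P(C_i = 3 \mid \mathbf{x}_i)}
  = \alpha_k + \boldsymbol{\beta}_k^{\top} \mathbf{x}_i,
\quad k \in \{1, 2, 4, 5, 6, 7\},
\qquad
\mathrm{OR}_{k,j} = \exp(\beta_{k,j}),
\qquad
\%\ \text{change in odds} = (\mathrm{OR}_{k,j} - 1) \times 100
```

Under this reading, a statement such as "74% higher odds" corresponds to an odds ratio of about 1.74 relative to Class 3.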

Table 4 Sociodemographic characteristics predicting latent classes

Class Membership and Survival

During the follow-up period, 19,350 deaths occurred over the 5,079,699 person-years under observation. Cox proportional hazards models were fitted, and the resulting hazard ratios for mortality are shown in Table 5. In addition, in Supplementary File 5, Table 5.1 provides the full descriptive statistics for each group. Model 1 shows the risk estimate for all-cause mortality according to engagement in the various health-behavior classes without adjusting for demographic factors, whereas the other models were adjusted for different demographic factors. Model 1 (unadjusted) shows that, compared with adults in the referent class (Class 3), adults in Class 1, Class 2, and Class 4 had a 51%, 58%, and 41% lower risk of mortality, respectively. Model 4 was further adjusted for demographic variables including age, gender, education, and personal income; when compared with the referent class, adults in Class 2 had a 4% lower risk of mortality. Similarly, the risk of mortality for adults in Class 5 was 1.78 times higher than for the referent class. Finally, compared with the participants in Class 3, the risk of mortality for the participants in Classes 6 and 7 was 1.23 and 1.77 times higher, respectively.
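
As a rough check on scale, the reported totals imply a crude mortality rate and a mean follow-up time that can be derived directly. The sketch below uses only the figures reported above. The function names are ours, and the conversion of 1.78 to a percent change assumes that this figure is a hazard ratio relative to Class 3.

```python
DEATHS = 19_350            # deaths during follow-up (reported)
PERSON_YEARS = 5_079_699   # person-years under observation (reported)
PARTICIPANTS = 290_279     # cohort size (reported)


def crude_rate_per_1000(deaths, person_years):
    """Crude mortality rate per 1,000 person-years."""
    return deaths / person_years * 1000.0


def hr_to_percent_change(hazard_ratio):
    """Express a hazard ratio as a percent change in hazard vs. the referent."""
    return (hazard_ratio - 1.0) * 100.0


rate = crude_rate_per_1000(DEATHS, PERSON_YEARS)
mean_follow_up_years = PERSON_YEARS / PARTICIPANTS

# Assumption: the reported 1.78 for Class 5 is a hazard ratio.
class5_change = hr_to_percent_change(1.78)
```

These work out to roughly 3.8 deaths per 1,000 person-years and a mean follow-up of about 17.5 years per participant. The crude figures are unadjusted and are not a substitute for the Cox model estimates in Table 5.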

Table 5 All-cause mortality risk according to engagement in different latent class

Discussion

This is the largest population-based sample identifying clustering of modifiable risk behaviors in relation to all-cause mortality. To the best of our knowledge, this is the first study to investigate the existence of lifestyle behavior clusters and to assess all-cause mortality in an adult Asian population. Seven latent classes were identified, which accounted for different modifiable risk behaviors. The risk behavior and demographic profile for each of these classes were distinct from one another. Our findings suggest that a clustering pattern between modifiable risk behaviors occurs among adults and that all-cause mortality increases with an increase in the number of modifiable risk behaviors. It was found that all of the adults engaged in at least one of the modifiable risk behaviors, and all seven classes were characterized with a 100% likelihood of having at least one unhealthy behavior coupled with the likelihood of having another four unhealthy risk behaviors. When examining the prevalence of multiple risk factors, only 13% of our sample occupied a class engaged in the maximum number of healthy behaviors.

This study also showed a demographic correlation with identified latent classes. Some research has shown no gender differences in terms of risk behavior [28], whereas other research [29] has shown that gender differences exist, which is similar to the findings in the present study. Previous studies have examined the clustering pattern of health-behaviors in various settings, such as among vocational education students [30], at-risk adult populations in the U.K. [31], and adults with SNAP health risk behaviors (smoking, poor nutrition, excess alcohol consumption, and physical inactivity) [32]. A review by Meader et al. found that alcohol consumption and smoking formed the most frequently identified risk behavior cluster, which is similar to our study’s findings [31]. However, another study found that males and those with greater social disadvantages engaged in riskier health-behaviors [32]. The present study showed that women were more likely to belong to Class 1, which had the lowest prevalence of alcohol consumption but a high prevalence of physical inactivity, and a study by Atorkey et al. similarly found that women were less likely to engage in hazardous drinking and tobacco smoking but were more likely to be physically inactive [30]. Some research points out several possible explanations for this gender effect. It has been suggested that women receive less social support for involvement in physical activities [33], and other researchers have found correlations between the physical and social environment and gender differences in terms of PA [34]. Our findings suggest that we need to emphasize PA among people (especially women) using environmental approaches, such as enhancing physical and built environments, rethinking community designs, and ensuring access to places in which to engage in such activities [35]. In this study, an increased number of health risk behaviors was also observed among young adults.
As seen in previous studies, the prevalence of multiple risk behaviors was high in young adults [36]. This has been explained by the fact that young adults gain liberty and social and economic independence with age [37], which favors access to places that sell alcohol, cigarettes, and junk food.

Even though previous research has demonstrated a social gradient of health in terms of mortality or morbidity outcomes, very few papers have looked at the association between SES and health-behavior clustering. Based on our findings, individuals with low levels of education had higher odds of engaging in the maximum number of risk behaviors (Class 5) compared to those who had higher levels of education. Prior studies that emphasized educational status also showed that people with higher education levels had higher adherence to health-behavior norms than those with less education [38]. Although lower SES has been associated with an increased number of risk behaviors [39], this association has not been consistent, and low SES has been found to be associated with increases, decreases, or no change in health-behavior [40]. The relationship of low SES with multiple health-behaviors has been attributed to multiple factors, for example, less access to fitness facilities, less knowledge about proper nutrition, unsafe living environments, less access to health care, and a scarcity of fresh fruits and vegetables [41]. Thus, a holistic approach is required, in which health policies provide more health education and promotion that take into consideration the socio-environmental factors and barriers that hinder individuals from engaging in healthy behavior.

Unhealthy behavior does not occur by itself. Our data showed that people who practice one type of unhealthy behavior are likely to engage in other unhealthy behaviors. In this study, the most prevalent combination of modifiable risk behaviors comprised physical inactivity co-occurring with a diet low in fruits and vegetables. For example, Class 1 (inactive, secondhand smoker, and low dietary intake) and Class 5 (daily smoker, daily drinker, low dietary intake, and inactive) had the highest prevalence of physical inactivity combined with a diet low in fruits and vegetables. These two groups both had a higher likelihood of mortality than Class 3 (nondrinker, nonsmoker, more dietary intake, and active), where people engaged in PA and had a high intake of fruits and vegetables. This is consistent with the results from studies carried out in other countries, including the USA [42], where physical inactivity and low fruit and vegetable intake were found to be the most predominant co-occurring behaviors, although the clustering investigated varied with the target population [30]. A review by Meader and colleagues indicated that co-occurrence data showed a particularly high prevalence for low fruit and vegetable intake and low PA, which was consistent with our findings [31]. For these classes, interventions that combine energy expenditure with nutritional strategies are needed. The co-occurrence of unhealthy behaviors may be related to individual personal characteristics or situations that facilitate unhealthy behavior, which may be derived from genetics, family experiences, and consistent peer pressure [43]. Individuals who are indiscreet and have related challenges in self-regulation are likely to end up in environments that promote multiple unhealthy behaviors [43]. Therefore, it is imperative to disentangle how these key processes converge and influence one another, both over short periods and over one’s lifetime.

The class that contained the maximum number of unhealthy behaviors (Class 5: daily smoker, daily drinker, low dietary intake, and inactive) represented 10% of the sample, and this population was deemed to be at the highest risk for all-cause mortality. Further, the Class 5 profile was consistent with recent work suggesting that 16.6% of deaths can be ascribed to engaging in the four modifiable risk behaviors (i.e., smoking, drinking alcohol, low dietary intake, and physical inactivity) [44]. In the present study, it is clear that these cumulative unhealthy behaviors can lead to the worst outcomes in young adults with low SES (particularly in the case of men). Therefore, the findings of this study suggest that there is potential for interventions aimed at multiple risk behaviors, either successively or simultaneously, when there is evidence of clustering. Furthermore, there is potential for intervening at the social or environmental level because of the strong relationship with SES.

Inadequate sleep increased the risk of death in our sample, which is similar to the findings of previous studies [7, 45]. In our findings, individuals in Class 2 (adequate sleep, nondrinker, and partially active), with a high prevalence of partial inactivity and adequate sleep, were 4% less likely to die than those in Class 3 (nondrinker, nonsmoker, more dietary intake, and active), who slept less but were active. A meta-analysis of prospective studies showed that people who reported consistently sleeping 5 h or less per night should be regarded as a higher risk group for all-cause mortality [45]. Further, a U-shaped association between sleep duration and all-cause mortality, with the lowest risk at 7 or 8 h of sleep, has been reported in many studies [46, 47]. Currently, the recommended hours for good sleep for adults vary. The American Academy of Sleep Medicine (AASM) and the Sleep Research Society (SRS) suggest that adults get at least 7 h of sleep every night to avoid the health risks of chronic inadequate sleep [48]. However, the National Sleep Foundation of America indicates that the required amount varies across the lifespan and from person to person.

Strengths and Limitations

The present study’s main strengths are its focus on key health-related behaviors in a large population-based sample and its inclusion of health-related measures that have been shown to have strong associations with mortality. Further, the use of a person-centered approach (through LCA) to identify distinct health-behavior classes rather than focusing on linear associations among risk factors strengthened the study’s approach. The LCA perspective provides important insights into how disease prevention programs may be targeted for or tailored toward different subgroups to improve their effectiveness.

Although based on a large sample of Taiwanese adults, the present study is subject to limitations. First, the health-behavior-related data in the study are cross-sectional, although we linked them with all-cause mortality to establish a causal direction for the observed effects. We were unable to determine whether a group ages out of specific classes, since health-related behavior frequently begins early in life and may eventually lead to permanent behavioral patterns. Second, the data were based on self-reporting, which may have resulted in information bias. Furthermore, we did not use income as an indicator for LCA because people tend not to report their income correctly. Third, the study did not control for other unobserved confounding factors that could result in unhealthy behavior. The current study controlled for age, gender, and education but not for other predisposing factors, such as genetic composition and level of psychological distress. For example, psychological distress may contribute to mortality [49]. In the general population, psychological distress is often associated with multiple health risk behaviors [50]. Without controlling for psychological distress, we were not able to establish a causal relation between the clustering of modifiable risk behaviors and mortality; we could only show an association. Further data exploration is needed to establish the role of psychological distress in early adulthood in the relationship between modifiable risk behaviors and mortality.

Conclusion

The study’s findings highlight the impact of modifiable risk behaviors on mortality. There was a clear clustering pattern of modifiable risk behaviors among the adults under consideration, where the risk of mortality increased with increases in unhealthy behaviors. Therefore, classes with individuals who are at high risk need health-related interventions: if interventions can be demonstrated to be viable in preventing and/or reducing multiple health risk behaviors, this could in future years help prevent an escalation of chronic health issues within low- and middle-income countries. Multi-component interventions that incorporate education, advice, counseling, and skill training should be delivered in various settings, including healthcare practices/clinics, workplaces, fitness centers, community centers, and university campuses. Further, this study’s findings suggest that men, younger individuals, and those in a low socioeconomic class should be targeted for multiple behavioral interventions since these groups appear to be the most at risk. The current study’s findings provide insights into the etiology of mortality in the adult population in relation to the clustering patterns of modifiable risk behaviors, which can provide strong empirical support for health prevention policies intended to improve the behavioral risk profile.