Background

Dental caries is a prevalent and widely distributed disease of the dental hard tissues. According to the World Health Organization, it is one of the most common diseases that must be prevented and controlled, along with cardiovascular disease and tumors [1, 2]. In the US, 27% of individuals under the age of 64 have untreated tooth decay, and 91% of adults have dental caries [3]. The cost of preventing and treating caries is a global economic burden [2]. For instance, projected spending on the treatment of dental disease was $122 billion in 2014 (Centers for Medicare and Medicaid Services 2011) [3].

Caries risk is influenced by multiple factors, including biological factors [4,5,6,7] and abiotic factors (such as deciduous dental caries [8], family influences [9], socio-economic status [10], BMI [11], and dietary habits [12]). Scientific and epidemiological data suggest a lifelong synergy between diet, nutrition, and the integrity of the oral cavity in health and disease [13]. Higher diet quality is related to a lower index of decayed, missing, and filled teeth (DMFT) [14]. It is currently accepted that caries is a sugar- and biofilm-dependent disease. The pathogenesis is well understood: bacteria in dental plaque (biofilm) metabolize dietary sugars and other fermentable carbohydrates to acids that demineralize dental enamel and dentine [15], and they produce enzymes that attack the protein component of the tooth, resulting in decay. Sugar is commonly found in bread, chocolate, drinks, and fruits, so an unhealthy diet can lead to excessive sugar consumption. Dietary factors other than sugar can also affect caries. For example, one study showed that a lack of vitamin D in the diet increased the risk of dental caries [16], whereas milk intake can reduce caries risk to a certain extent [17]. Revealing the association between diet and caries is therefore of great significance for reducing caries risk.

Because diet has a significant influence on caries risk, researchers worldwide have conducted many studies in recent years. The frequency of eating may be more closely related to caries experience than the amount of food consumed [18, 19]. A study of 4,467 individuals in the United States applied principal component analysis to 24-h dietary recall data, reduced eating patterns to three, and found that a diet of "high-sugar drinks and sandwiches" was associated with DMFT [20]. Another analysis of dietary frequency found that American children who ate fewer than five servings of fruits and vegetables a day were more likely to develop cavities in their primary teeth (OR = 3.21, P < 0.05) [21]. Different foods can either raise or lower caries risk. A study of 4,111 New Zealand children found that higher consumption of ice cream, noodles, rice porridge and refined breakfast cereals was positively associated with dental caries [17]. Other studies have revealed a protective effect of specific diets. An analysis by Sanders and colleagues reported that an increased intake of long-chain omega-3 fats, whole grains, and vegetables (excluding potatoes) was inversely associated with dental caries [22]. Related research has also been carried out in China. Qin conducted a study on caries in children aged 10–12 in Guangdong province, showing that those who ate desserts or chocolate more than twice a day were more likely to have caries [23]. Similarly, caries prevalence was 69.8% among those who ate sweets at least once a day, compared with 57.5% among those who ate sweets less than once a day (P = 0.027) [24].

Researchers have used many statistical methods to explore the potential association between dietary patterns and caries experience. Li and colleagues used multinomial logistic regression to study the factors influencing dental caries among adolescents in Guangdong province and found that frequent consumption of sweetened milk, tea, or coffee was a risk factor [24]. Meanwhile, a binary logistic regression model showed that caries prevalence was associated not with cake or dessert intake but with flavored milk or yogurt and honey [25]. Although these studies demonstrate a potential link between the consumption frequency of certain foods and caries prevalence, a person's eating habits are composed of many food types [26], and there are complex interactions and potentially cumulative relationships between foods [27]. Moeller argues that it is not advisable to study the separate effect of each dietary component on disease [28]. Schulze's study also showed that isolating the effect of a single food on disease is difficult because of multicollinearity [29]. Compared with traditional statistical methods, cluster analysis has an advantage in addressing this problem: it can stratify samples into distinct groups according to individual characteristics [30, 31].

Studies in China to date have lacked a comprehensive analysis of the relationship between dietary patterns and caries experience. Understanding this relationship can help guide adolescents to improve their dietary structure and reduce their risk of dental caries. This study therefore used descriptive statistics and cluster analysis to analyze the caries experience of adolescents aged 12–15 in different areas of Shanxi Province. The aim is to reveal the potential link between adolescents' dietary patterns and the risk of dental caries, and thus to provide guidance for the government and school canteens in improving policies and citizens' nutritional habits.

Methods

Study design and participants

This survey was conducted with reference to the WHO Oral Health Surveys: Basic Methods (5th ed) [32] and the Oral Health Survey Test Methods issued by the former National Health and Family Planning Commission in 2015. The data came from the first oral epidemiological survey in Shanxi Province in 2018. The protocol was reviewed and approved by the Ethics Committee of Shanxi Medical University. The guardians of all participants signed informed consent.

According to the characteristics of regional distribution and the principle of random sampling, 117 municipal districts, counties, and county-level cities in 11 administrative districts of Shanxi Province were sampled. Participants were selected by stratified random sampling. We collected the list of middle schools in each survey district (county/county-level city) and ranked the schools by the number of students aged 12–15. Probability proportionate to size (PPS) sampling was used to select three middle schools in each city (county/district). In total, 36 schools in 12 cities (counties/districts) were selected through multi-level stratified random sampling, taking into account the balance of regional and urban–rural distribution as well as the feasibility of implementation (Fig. 1). The sample size was calculated using the following formula:

Fig. 1 The randomly sampled areas. (Compiled from the base map provided by Amap: https://lbs.amap.com/)

$$n=\frac{Z_{1-\alpha/2}^{2}\times p(1-p)}{\delta^{2}}$$

where p is the dental caries rate among 12-year-olds according to the Fourth National Oral Health Epidemiological Survey in China [35] (p = 34.5%), α = 0.05, δ = 0.1, and Z = 1.96 is taken from the standard normal distribution table. The unit nonresponse rate was set at 10%, and the sampling design effect was deff = 4. The result was 95 students per age per school (n = 95); therefore, N = n × 36 × 4 = 13,680. Ultimately, the sample included 13,680 individuals from 36 schools in 12 cities (counties/districts) in Shanxi Province, China, who were aged 12 to 15 and had lived locally for at least six months.
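The arithmetic behind the sample size can be checked with a short sketch. It evaluates the formula with the values given in the text (p = 0.345, δ = 0.1, α = 0.05) and then inflates the result by 10% for nonresponse. Treating nonresponse as a multiplicative inflation is an assumption made here for illustration, and the function name `base_sample_size` is hypothetical. The text does not spell out how deff enters the calculation, so it is left out of the sketch.

```python
from statistics import NormalDist


def base_sample_size(p: float, delta: float, alpha: float = 0.05) -> float:
    """Evaluate n = Z^2 * p * (1 - p) / delta^2 for a two-sided level alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z ** 2 * p * (1 - p) / delta ** 2


# Values stated in the text.
n_base = base_sample_size(p=0.345, delta=0.1)

# Assumption: the 10% nonresponse rate is applied as a 1.10 inflation factor.
n_inflated = n_base * 1.10

# Total as reported in the text: n = 95, 36 schools, factor 4.
total = 95 * 36 * 4

print(f"base n = {n_base:.1f}, inflated n = {n_inflated:.1f}, total N = {total}")
```

Under these assumptions the base value is a little under 87 and the inflated value falls between 95 and 96, in line with the reported n = 95.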

Data collection and data quality assurance

All examining dentists received theoretical training and passed a standard consistency test in early 2018 (kappa value greater than 0.8). The lead team for the epidemiological investigation certified the examiners before they took part in the survey. The examiners performed oral examinations according to WHO standards using disposable dental mirrors. They recorded the decayed, missing, or filled status of each tooth in detail and re-examined a random 5% of the sample each day. The diagnostic criteria for decayed, missing, and filled teeth (DMFT) were used to estimate dental caries prevalence.
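As a minimal sketch of how the two outcome measures relate, the code below computes a DMFT score per person and the caries prevalence of a group. The status codes ("S" for sound, "D", "M", "F") and the example records are hypothetical and are not the survey's recording format. Prevalence is taken here as the share of individuals with a DMFT score greater than zero, which is an assumption about the definition.

```python
def dmft_score(tooth_statuses):
    """Count teeth recorded as decayed (D), missing (M), or filled (F)."""
    return sum(1 for status in tooth_statuses if status in {"D", "M", "F"})


def caries_prevalence(records):
    """Share of individuals whose DMFT score is greater than zero."""
    affected = sum(1 for teeth in records if dmft_score(teeth) > 0)
    return affected / len(records)


# Hypothetical records: one list of per-tooth status codes per participant.
records = [
    ["S", "S", "D", "F"],  # DMFT = 2
    ["S", "S", "S", "S"],  # DMFT = 0
    ["M", "S", "S", "S"],  # DMFT = 1
    ["S", "S", "S", "S"],  # DMFT = 0
]

scores = [dmft_score(teeth) for teeth in records]
mean_dmft = sum(scores) / len(scores)
prevalence = caries_prevalence(records)
```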

Questionnaire

Each participant was asked to complete a questionnaire independently before undergoing the oral health examination. The questionnaire covered basic information such as gender, date of birth, region, and ethnicity. Importantly, it also covered the frequency of consumption of the following foods:

  • Desserts and confectionery (biscuits, cakes, bread, chocolate, sugary mouth candy);

  • Sweet drinks (sugar water, carbonated drinks such as cola, fruit juices such as orange juice and apple juice, non-fresh fruit juices such as lemonade);

  • Sugar-sweetened milk (yogurt, milk powder, tea, soy milk, coffee);

  • Vegetables;

  • Fruits;

  • Coarse grains (corn, purple rice, sorghum, oats, buckwheat, wheat bran);

  • Protein foods (beans, eggs, meat, fish, animal offal).

Each food item was rated on a six-point frequency scale, ranging from "Never or hardly ever" to "more than once a day". A questionnaire with answers to at least 80% of all questions and 100% of the key questions was considered a complete interview. After the preliminary design of the questionnaire, we carried out a pretest. The reliability analysis gave a Cronbach's α coefficient of 0.82, which indicates good reliability.
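For readers unfamiliar with the reliability coefficient, the sketch below computes Cronbach's α from item-level scores using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The function name and the responses are hypothetical; the study's actual pretest data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])  # number of items
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 3 items on a six-point scale.
responses = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 5, 4],
    [6, 5, 6],
    [2, 2, 3],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Items that move together across respondents push α toward 1, while unrelated items pull it toward 0.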

Additional covariates

Covariates associated with caries can affect the outcome, so they were included in the statistical analysis to adjust for their effect. Potential confounders were identified from prior literature. Thus, in addition to the questions above, the questionnaire included the following items: family size (one-child or more-than-one-child family), place of residence (rural or urban), ethnicity, and brushing habits (ranging from "never or hardly ever" to "more than twice a day"). From local official websites we collected the following information: Normalized Difference Vegetation Index (NDVI) [33], mean annual precipitation, mean annual temperature, PM2.5 (Statistical Yearbook of Shanxi Province, 2017), and Digital Elevation Model data (DEM, based on ASTER GDEM).

Statistical analysis

This study used simple descriptive statistics (means and proportions) for data processing. The chi-square test, t-test, and analysis of variance (ANOVA) were used to test differences between groups. Multinomial logistic regression and cluster analysis were used to study diet and caries risk. P values of 0.05 or less were considered statistically significant. All data were entered into Microsoft Access using multiple entry, and SPSS 26.0 was used for the analysis.

Based on monthly intake frequency, the responses to the seven dietary questionnaire items were assigned proportional scores to evaluate the relative effect of dietary intake frequency on the presence of dental caries (Table 1). This assignment cannot accurately represent the actual monthly frequency of food intake. However, we believe that it still reflects the relative ratios of food intake frequency.

Table 1 Frequencies assignment table

K-means clustering analysis was used to explore the dietary characteristics of the valid samples. The k-means algorithm divides N samples into K clusters so that sample points within a cluster are highly similar, while sample points in different clusters have low similarity. Similarity is calculated with respect to the mean of the sample points in a cluster. The principle is to use the Euclidean distance as the similarity index, repeatedly recalculate the cluster centers, assign each sample to its most similar center, and finally divide all samples into K clusters. The K value is therefore a crucial parameter, and its choice significantly influences the statistical results. In this study, we used the "elbow" method to determine the optimal K value, which is implemented by calculating the sum of squared errors (SSE):

$$\text{SSE} = \sum_{i=1}^{k} \sum_{j \in C_{i}} \left| j - \overline{C}_{i} \right|^{2}$$

In this formula, SSE is the error value, Ci is any one cluster, j is a sample contained in that cluster, and C̄i is the center of that cluster. For a given cluster, the lower the SSE, the closer its members are to each other. SSE generally decreases as the number of clusters (K value) increases. Up to a critical point SSE drops sharply, and beyond it SSE declines only slowly. This tipping point, known as the "elbow", can be considered the point of best clustering performance. In the k-means clustering analysis, the maximum number of iterations was set to 50, the convergence criterion was 0, and the initial cluster centers were not disturbed.
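The procedure described above can be sketched in a few lines of plain Python. The sketch runs a basic k-means loop with at most 50 iterations, stops when no sample changes cluster (which corresponds to a convergence criterion of 0), and reports the SSE for several candidate K values so that an elbow can be looked for. The data are synthetic stand-ins, not the survey data, and all function names are hypothetical. The initialization (random samples, several restarts) is a choice made here and may differ from the SPSS procedure used in the study.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=50, seed=0):
    """Basic k-means; returns (centers, labels, sse)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # random initial centers
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no sample changed cluster, so the centers no longer move
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse


def elbow_curve(points, k_values, n_init=5):
    """Lowest SSE over n_init random starts for each candidate K."""
    return {
        k: min(kmeans(points, k, seed=s)[2] for s in range(n_init))
        for k in k_values
    }


# Synthetic stand-in data: three groups of 7-dimensional frequency scores.
data_rng = random.Random(42)
group_means = [[2.0] * 7, [10.0] * 7, [20.0] * 7]
points = [
    [m + data_rng.gauss(0, 1) for m in mean_vec]
    for mean_vec in group_means
    for _ in range(40)
]

sse_by_k = elbow_curve(points, range(1, 7))
for k, sse in sse_by_k.items():
    print(k, round(sse, 1))
```

Plotting the printed SSE values against K gives a curve of the kind shown in an elbow chart; the K at which the curve flattens is the candidate for the number of clusters.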

Results

Participant characteristics

A total of 13,680 adolescents aged 12 to 15 years in Shanxi Province were surveyed. Complete interviews accounted for 88.02%, partial interviews for 8.11%, and break-off interviews for 3.87%. After removing participants whose clinical examination was missing or unrecorded, the final valid sample consisted of 11,351 individuals. Among them, 11,329 (99.8%) were Han Chinese. The response rate in this cross-sectional study was 82.98%.

Descriptive statistics

The caries prevalence and associated factors are shown in Table 2. The caries prevalence among adolescents aged 12 to 15 in Shanxi was 44.57%, and the DMFT score was 0.98 ± 1.49. Prevalence was lower in males (39.81%) than in females (49.31%), and the difference was statistically significant (χ2 = 103.59, P < 0.001). Among adolescents aged 12–15 years, the caries rate increased with age, and the difference was statistically significant (χ2 = 25.33, P < 0.01).

Table 2 Dental caries prevalence and associated factors (N = 11,351)

As shown in Fig. 2, eating habits differ among regions of Shanxi. For example, the dessert consumption frequency in Zuoyun County, Datong, and Yanhu District, Yuncheng, is relatively higher than in other regions. The average prevalence of caries decreased from North to South in Shanxi Province, with the highest prevalence in Xinzhou (1.39) and the lowest prevalence in Yangquan (0.75), showing a significant difference (P < 0.05).

Fig. 2 Geographical distribution* of dietary questionnaires (monthly intake frequency) and caries prevalence. (Compiled from the base map provided by Amap: https://lbs.amap.com/) *To show the comparison clearly, the schematic scope of each surveyed area in the figure is expanded to the administrative city (county) where it is located.

Dietary patterns and caries analysis

We used the "elbow" method to determine the number of clusters (Fig. 3). The analysis indicated that k = 8 (that is, dividing the sample into eight categories) was statistically the optimal number of clusters. The cost function is relatively small at this point, and this k value is conducive to the research.

Fig. 3 Trend chart of SSE value under different K values

Based on the characteristics of individual diet frequency, the k-means clustering algorithm produced eight dietary clusters. We then named the clusters as eight dietary patterns according to the features of the final cluster centers (Table 3). There were statistically significant differences among all groups (P < 0.001). For convenience of description, we named the eight categories as follows:

  • 1: High balanced frequency diet. (The eating frequency of all kinds of surveyed food is relatively high.)

  • 2: Low balanced frequency diet. (The eating frequency of all kinds of surveyed food is relatively low.)

  • 3: Non-staple-food-rich diet. (The eating frequency of fruits and vegetables is relatively high.)

  • 4: Refreshments-rich diet. (The eating frequency of all kinds of sweet foods is relatively high.)

  • 5: Vegetable-rich diet.

  • 6: Coarse-grains-rich diet.

  • 7: Limited-refreshments diet. (The eating frequency of foods other than sweets is relatively high.)

  • 8: Desserts-rich diet.

Table 3 Distribution of food intake frequency in the dietary patterns (Dietary K-means clustering analysis, final cluster center)

The largest number of adolescents were grouped into the “low balanced frequency diet” cluster (n = 4,110), while the “high balanced frequency diet” cluster had the fewest individuals (n = 495).

The prevalence of dental caries was highest among adolescents with the refreshments-rich diet (52.78%) and lowest among those with the coarse-grains-rich diet (42.01%). Adolescents with higher DMFT scores tended to have the high balanced frequency, refreshments-rich, or desserts-rich dietary pattern, while those in the coarse-grains-rich diet group had lower DMFT scores. In South Shanxi, the caries prevalence among adolescents with the vegetable-rich diet was the lowest, at only 37.89%, which was lower than that in North Shanxi (P < 0.05). The proportion of individuals with the limited-refreshments dietary pattern was smaller in North Shanxi than in other regions, and their DMFT score was higher than that in Middle and South Shanxi (P < 0.05) (Table 4).

Table 4 ANOVA of dietary patterns in different areas

Table 5 shows the results of multiple logistic regression after adjusting for covariates. Compared with the low balanced frequency diet, both the high balanced frequency and the refreshments-rich dietary patterns were associated with a higher caries risk (P < 0.05). The odds of caries in males were 0.691 times those in females (OR: 0.69; 95% CI, 0.64–0.75).
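To make the reported odds ratio concrete, the sketch below shows the usual link between a logistic regression coefficient and its odds ratio with a 95% confidence interval: OR = exp(β) and CI = exp(β ± 1.96 × SE). The coefficient and standard error are not given in the text, so they are back-calculated here from the reported OR (0.69) and interval (0.64–0.75); treat them as approximations for illustration only. The function name is hypothetical.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error to (OR, low, high)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Reported values for males versus females.
reported_or, ci_low, ci_high = 0.69, 0.64, 0.75

# Back-calculated, approximate log-scale quantities (not reported in the text).
beta = math.log(reported_or)
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

odds_ratio, low, high = odds_ratio_with_ci(beta, se)
significant = high < 1 or low > 1  # interval excludes 1
print(f"OR = {odds_ratio:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

Because the whole interval lies below 1, the lower odds for males are statistically significant at the 5% level.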

Table 5 Association between caries prevalence and dietary patterns after adjusting for covariates

Discussion

Adolescents aged 12 to 15 years are a particularly important group in epidemiological surveys of caries. At this age, all their permanent teeth (excluding the third molars) have erupted, and they begin to determine their own diet and oral hygiene independently [34]. Therefore, for this cross-sectional survey conducted in 2018, adolescents aged 12 to 15 were selected as the subjects of both a questionnaire and an oral health examination.

The prevalence of dental caries in mainland China has been generally high from the 1980s (52.0%, 95% CI: 49.4%–54.6%) to the 2010s (53.1%, 95% CI: 50.8%–55.5%), showing an increasing trend over the past 38 years [35]. The fourth oral epidemiological survey in China showed that the caries rate of permanent teeth among adolescents aged 12–15 years was 41.9%, and the DMFT score was 1.04 [36]. In our study, the caries rate of permanent teeth in Shanxi adolescents aged 12–15 years was significantly higher than the national average (P < 0.05) but lower than that of Hainan Province (60.5%) [37] and Liaoning Province (53.2%) [38]. The caries prevalence in 12–15-year-old males in Shanxi was lower than that in females (P < 0.001), which is consistent with the findings of Xiao [39] in Jiangsu Province. This may be because females eat snacks more frequently than males. Additionally, our study revealed significant spatial heterogeneity in the prevalence of dental caries across different areas of Shanxi, consistent with a recent study [40].

Caries is affected by many risk factors, among which dietary factors should not be ignored [41]. Our study found significant differences between regions in the intake frequency of different foods. We divided the dietary types of adolescents aged 12 to 15 into eight categories through cluster analysis and named the eight dietary patterns according to their frequency distribution characteristics. The refreshments-rich diet had a higher OR than the desserts-rich diet, possibly because dietary patterns with a markedly higher frequency of sugary foods contribute more to caries risk. This conclusion is consistent with earlier studies [15, 17, 19]. A higher consumption of high-fibre grain products has been associated with fewer caries [42]. Our study found that a diet rich in coarse grains seemed to slightly reduce caries risk (OR: 0.90; 95% CI, 0.79–0.97), which is consistent with the findings of Dye [21] and Thornley [17]. After adjusting for confounding factors, the association between the "vegetable-rich diet" and caries risk (OR: 0.95; 95% CI, 0.86–1.06) was not statistically significant, as the 95% confidence interval includes 1. This result may be due to the relatively small sample size of the study or to a weak effect of vegetable intake frequency on caries risk [20]. The non-staple-food-rich dietary pattern showed no significant difference from the low balanced frequency diet, although Guo's research shows that individuals with high caries activity have a low intake of fruits, vegetables, and fiber [43]. Adolescents with the high balanced frequency diet had a higher risk of caries than those with the low balanced frequency diet. This may be because sugary foods increase caries risk much more than coarse grains and vegetables reduce it.

Oral health is influenced by social determinants of health (SDH), which predispose individuals and communities to a greater risk of developing caries [44]. In addition to age and sex, factors such as agriculture and food production, living environment, and family conditions also contribute to the SDH of dental caries. Our study found that the proportion of the "coarse-grains-rich diet" cluster in South Shanxi was similar to that in North Shanxi (P < 0.05). However, the caries prevalence and DMFT score of the "coarse-grains-rich diet" cluster in South Shanxi were significantly lower than those in North Shanxi. The per capita meat output is higher in the south of Shanxi Province, resulting in a relatively high proportion of protein foods. Conversely, the proportions of the "limited-refreshments diet" and "vegetable-rich diet" patterns in North Shanxi were relatively low, corresponding to the region's high caries rate and DMFT score. The climate in North Shanxi is fairly cold, with less precipitation. The planting area of sorghum, millet, corn, and other crops is rather large, while the corresponding output of vegetables and fruits is relatively small [45]. This may indicate that a higher frequency of refreshments and a lower frequency of vegetable and fruit intake are important reasons for the higher caries risk in North Shanxi. The caries prevalence in Middle Shanxi is slightly higher than in South Shanxi. We therefore speculate that the local per capita output of various foods might be one of the reasons for the differences in dietary patterns and caries experience. Additionally, adolescents who grow up in one-child families have a lower caries risk than those who grow up in families with more than one child. Kateeb [46] suggests that improving socio-economic conditions would give parents more control over their children's oral health and minimize the level of the disease, which also emphasizes the crucial influence of SDH.

Preventing adolescent oral diseases through improved dietary patterns, not only in Shanxi Province but worldwide, requires the positive influence of both individuals and social factors. Specifically, the government, school canteens, and media outlets should advocate dietary patterns with low caries risk. The government and society should work to reduce caries risk among adolescents through measures including, but not limited to, educating the public about the causes of caries occurrence and development, advocating a healthy diet with low sugar content, implementing a tax on sugar-sweetened beverages (SSBs), and controlling the proportions of various nutritious foods in school canteens.

Strength and limitations of the study

This cross-sectional survey is the largest survey of oral diseases in Shanxi Province in the twenty-first century. Using cluster analysis, we investigated the association between dietary patterns and the risk of caries. This method is rarely used in studies of dietary patterns and dental caries. It has the advantage of comparing individuals' dietary patterns as a whole rather than focusing on individual foods, and it produces clear and easily understandable results [25, 31, 47].

However, our study has some limitations. First, although it is internationally recognized that eating frequency is more important than the amount of food intake in dental caries research, we investigated only dietary frequency and did not consider the effect of nutritional intake on caries risk. Second, although a scientific estimation method was used to determine the sample size, some clusters contained few individuals after the dietary cluster analysis, which made it difficult to reach statistical significance. Additionally, our study was cross-sectional. Cohort studies have a clear advantage in exploring the link between diet and dental caries. For example, a prospective study found an association between early-life free sugar intake and the prevalence of dental caries five years later [48]. We therefore plan to conduct a cohort study to further reveal the potential link between dietary patterns and dental caries risk.

Conclusions

This study reveals the relationship between dietary patterns and dental caries experience and analyzes several dietary patterns that can reduce or increase the risk of dental caries. Reducing the intake frequency of sugary food (desserts, confectionery and sugary drinks) and increasing the intake frequency of coarse grains can reduce the risk of dental caries in adolescents. Social determinants of health such as sex, family size, and dietary patterns influence the risk of caries. The government, school canteens and news media should take dietary pattern factors seriously.