Background

Colorectal cancer (CRC) is the fourth most commonly diagnosed gastrointestinal (GI) malignancy and the third leading cause of cancer-related death worldwide [1,2,3]. It occurs in 5% of the general population at any given time [1]. According to GLOBOCAN 2018, it is the third most common cancer in Iran, with 9864 new cases in 2018 [4]. The mean direct medical cost of CRC per patient in Iran is more than 16,000 US dollars [5]; thus, its economic burden is estimated to range from 175,000,000 to 250,000,000 US dollars in 2019.

The incidence rate of CRC has increased in both developing and Western countries over recent decades [6,7,8]. The global burden of CRC is expected to increase by 60%, to more than 2.2 million new cases and 1.1 million deaths, by 2030 [9]. Each year, over 132,000 new cases of CRC are diagnosed in the United States, and approximately fifty thousand patients die from this cancer [10]. The five-year survival rate is above 90% for the first stage of the disease [11]. A large body of evidence has revealed that environmental and modifiable factors such as smoking, alcohol, obesity, unhealthy dietary habits, diabetes and physical inactivity have a major impact on the development of CRC [2, 6, 8, 12, 13]. Based on the diabetes country profiles of the World Health Organization (WHO) in 2016, the prevalence rates of physical inactivity, overweight, and obesity in Iran were 31.9, 60.5, and 24.9%, respectively [14]. Generally, unhealthy lifestyles might account for up to 70% of CRC etiology [15, 16]. It has been reported that obesity, particularly central obesity, is one of the most significant predisposing factors for numerous cancers and chronic diseases [3]. Moreover, it has been shown that obesity is a meaningful contributor to CRC and is considered a poor prognostic factor in cancer development [11, 17, 18]. On the other hand, losing weight might have desirable effects on the prognosis of the disease [19]. The WHO classifies body weight by body mass index (BMI) as follows: normal weight (BMI: 18.5–24.9 kg/m2), overweight (BMI: 25.0–29.9 kg/m2), obesity (BMI: 30.0–34.9 kg/m2), severe obesity (BMI: 35.0–39.9 kg/m2) and morbid obesity (BMI ≥40 kg/m2) [1, 2, 20]. Approximately 30% of the American population is classified in the overweight or obese category [21]. Obesity could be associated with cancers such as breast, liver, gynaecological, oesophageal, kidney, lung, pancreatic, thyroid and gallbladder cancers, as well as CRC [6, 21].
Obesity initiates different cellular and molecular pathways, which eventually lead to tumor formation. Adipose tissue produces many kinds of hormones and pro-inflammatory cytokines; among them, interleukin 6, tumor necrosis factor-α, leptin and adiponectin provide a favourable inflammatory microenvironment for cancerous cells [22, 23]. Current studies have revealed that adipose tissue stimulates proliferation, migration, angiogenesis and oxidative stress induction [21]. In a recent meta-analysis by Dong et al., it was demonstrated that abdominal obesity is highly associated with an increased relative risk of CRC [6].

In addition to obesity, insulin resistance and hyperinsulinaemia are also associated with CRC [17, 24]. Insulin resistance and insulin response are highly correlated, since the majority of insulin-resistant individuals are in either the highest or the second-highest insulin response quartile [25]. Besides, numerous epidemiological studies show that CRC is more prevalent among diabetic patients than among non-diabetic ones [26]. Several observations have indicated an association between diabetes and an elevated incidence ratio of cancer in specific organs such as the liver, pancreas, endometrium, breast, bladder, and colon. Aberrant insulin regulation underlies both diabetes and obesity-related tumorigenesis through several signalling pathways, such as those involving insulin-like growth factor (IGF)-1 receptors [27, 28].

In the current study, we investigated the association between the diagnosis of CRC, obesity and diabetes in a selected group of CRC patients.

Methods

The study population consisted of patients referred to the colonoscopy unit of Reza Radiotherapy and Oncology Centre, Mashhad, Iran, from May 2015 to October 2017. Symptoms of colon cancer in these patients included changes in bowel movements, rectal bleeding, anemia, weight loss without dieting, loss of appetite, nausea or vomiting, and persistent abdominal discomfort such as cramps, gas or pain.

Patients in the case group (N = 178) had a diagnosis determined by colonoscopy and confirmed by pathology. The control group (N = 515) comprised individuals who underwent CRC screening by colonoscopy that was negative for polyps and CRC throughout the entire colon and rectum.

All subjects completed the administered questionnaires before their colonoscopy. The present study was approved by the Mashhad University of Medical Sciences (MUMS) ethics committee (approval #940358), confirming that the authors obtained consent to publish from the participants. All methods were performed in accordance with the relevant guidelines and regulations. Exclusion criteria were previous CRC, a positive family history of adenomatous polyposis, inflammatory bowel disease, hereditary CRC, and incomplete colonoscopy or documentation. Demographic characteristics, colonoscopy reports, history of drug (opium) use and smoking, and medical history were all collected. Weight and height were measured, BMI was calculated, and patients were subsequently classified according to WHO benchmarks.
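As an illustration of the BMI calculation and classification step, the following minimal Python sketch computes BMI from weight and height and assigns a WHO category using the cut-offs quoted in the Background. The function names and the "underweight" label for BMI below 18.5 kg/m2 are assumptions made for illustration; they are not taken from the study's analysis code.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Return body mass index in kg/m^2 (weight divided by height squared)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def who_category(bmi_value: float) -> str:
    """Map a BMI value to the WHO categories quoted in the Background.

    The 'underweight' label (BMI < 18.5) is an assumption; the text only
    lists categories from normal weight upwards.
    """
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:
        return "normal weight"
    if bmi_value < 30.0:
        return "overweight"
    if bmi_value < 35.0:
        return "obesity"
    if bmi_value < 40.0:
        return "severe obesity"
    return "morbid obesity"


if __name__ == "__main__":
    # Hypothetical participant: 80 kg, 1.75 m
    value = bmi(80.0, 1.75)
    print(round(value, 1), who_category(value))
```

Using half-open intervals (for example, 25.0 ≤ BMI < 30.0 for overweight) avoids gaps between the one-decimal ranges given in the text, such as values between 24.9 and 25.0.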

In addition, the location, size and number of the polyps were recorded during the colonoscopy. The location of each lesion was defined as anal, rectum, sigmoid, transverse colon, descending colon, ascending colon, or cecum. Based on histological classification, the polyps were divided into the two major classes of colorectal polyps: conventional adenomas, including tubular, tubulovillous or villous adenomas, and serrated lesions, including hyperplastic polyps, sessile serrated polyps or traditional serrated adenomas [29]. Histopathological characteristics of polyps were determined by two expert gastroenterology pathologists. Individuals who had a colonoscopy for the first time with no symptoms were considered “screening colonoscopy participants”. Patients who had previously had colonoscopies with polyps removed and were admitted for follow-up were termed “follow-up colonoscopies”. Patients undergoing colonoscopies for symptoms such as abdominal pain or rectal bleeding were defined as “diagnostic colonoscopies” [30].

Statistical analysis

The data were presented as mean and standard deviation (SD). The t-test was used to compare means between the case and control groups for all variables with a parametric distribution. Pearson’s chi-squared test (χ2) was applied to categorical variables. Risk factors were identified by comparing the normal colonoscopy and case groups, and p-values < 0.05 were considered statistically significant. Classification was performed using multiple classification models designed to distinguish normal colonoscopy from adenoma positive cases based on the important attributes that differed between the two groups. The classification methods consisted of decision tree (DT), random forest (RF), neural network (NN), K-nearest neighbour (KNN) and support vector machine (SVM) [31, 32], each of which is briefly described below.

Decision tree (DT): A decision tree classifier is a rule-based classifier and a widely used tool for classification and prediction. A decision tree is a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label.

Random forest (RF): Random forests are ensembles of many decision trees, each created by using a subset of the attributes used to classify a given population. The trees vote on how to classify a given instance of input data, and the random forest aggregates those votes to choose the final prediction. This is done to reduce overfitting, a common flaw of decision trees.

Neural network (NN): Neural networks are loosely based on the operation and structure of the human brain. These networks process one record at a time and “learn” by comparing their classification of the record (which, at the beginning, is largely arbitrary) with the known actual classification of the record. Neural networks are typically organized in layers made up of a number of interconnected ‘nodes’, each of which contains an ‘activation function’. Patterns are presented to the network via the ‘input layer’, which communicates with one or more ‘hidden layers’ where the actual processing is done via a system of weighted ‘connections’.

K-nearest neighbour (KNN): K-nearest neighbour is a simple and popular example-based algorithm and one of the lazy learning techniques. It classifies a new (test) object by searching the training data for the K objects closest (most similar) to it. Generally, the Euclidean distance is used to define the distance between a training object and a test object.

Support vector machine (SVM): Support vector machine is another popular classification method. SVM first maps the input vector into a feature space of higher dimensionality and identifies the hyperplane that separates the data points into two classes. The marginal distance between the decision hyperplane and the instances closest to the boundary is maximized. The resulting classifier achieves considerable generalizability and can therefore be used for the reliable classification of new samples. Probabilistic outputs can also be obtained for SVM. The identified hyperplane can be thought of as a decision boundary between the two classes, and the existence of this boundary allows misclassifications produced by the method to be detected.
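To make the K-nearest neighbour description concrete, the following is a minimal pure-Python sketch of majority-vote KNN with Euclidean distance. It is not the Weka implementation used in the study; the function names and the toy (age, BMI) records are hypothetical and serve only to illustrate the technique.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training records."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training records")
    # Indices of training records sorted by distance to the query
    order = sorted(range(len(train_X)), key=lambda i: euclidean(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical training records: (age in years, BMI in kg/m^2)
    X = [(30, 22.0), (35, 23.0), (40, 24.0), (60, 29.0), (65, 31.0), (70, 30.0)]
    y = ["normal", "normal", "normal", "adenoma", "adenoma", "adenoma"]
    print(knn_predict(X, y, (62, 30.0), k=3))
```

In practice, features measured on different scales (such as age and BMI) are usually standardized before distances are computed, so that no single attribute dominates the Euclidean distance.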

In summary, using these classification methods with the important features, including age, BMI, diabetes, family history of colon cancer and drug abuse, we can predict the result of colonoscopy (normal colonoscopy or adenoma positive). All cases were matched to controls with regard to age and weight. The statistical analyses were performed with the R programming language and the Waikato Environment for Knowledge Analysis (Weka) toolkit.
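For readers unfamiliar with Pearson's chi-squared test on a binary risk factor, the sketch below computes the statistic and p-value for a 2 × 2 table in plain Python. It assumes no continuity correction (the text does not state whether one was applied in the study's R analysis), and the counts in the usage example are hypothetical rather than study data.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test for a 2x2 table without continuity correction.

    Table layout (hypothetical naming):
                    exposed   unexposed
        cases          a          b
        controls       c          d
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical counts: 20 of 100 cases and 10 of 100 controls exposed
    stat, p = chi2_2x2(20, 80, 10, 90)
    print(round(stat, 3), round(p, 4))
```

The closed-form p-value holds only for one degree of freedom; larger contingency tables require the general chi-squared distribution.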

Results

Overall, 693 patients participated in this study, with a mean (SD) age of 49.84 (14.63) years. About half of the patients, 347 (51%), were female. Approximately 446 (65%) of the participants had BMI > 25 kg/m2 and 553 (79.9%) had a positive medical history (Table 1). About 570 (82.2%) of the patients had a family history of different cancers. Nearly 90 % of patients were regular smokers and 30 (3.1%) and 53 (7.6%) of subjects had a positive history of alcohol and drug abuse, respectively. The colonoscopy indications were screening in 36 (5.2%), follow-up in 116 (16.7%) and diagnostic in 541 (78.1%). The pathological interpretation of colonoscopy biopsies was normal in 515 (74.3%) of cases, tubular adenoma in 92 (13.3%), adenocarcinoma in 30 (4.3%), tubulovillous adenoma in 21 (3%), hyperplastic polyp in 14 (2%), benign polyp in 13 (1.9%), sessile serrated adenoma in 4 (0.6%), villous adenoma in 2 (0.3%) and traditional serrated adenoma in 2 (0.3%) of the patients. Altogether, positive polyps and positive adenomas accounted for 149 (21.5%) and 115 (16.6%) of the patients, respectively. The adenomas were mostly located in the sigmoid, 51 (36.7%), and rectum, 27 (19.4%) (i.e. the left side of the GI tract). Furthermore, most patients (58.57%) had a polyp or tumor in the rectum and sigmoid (rectosigmoid) and only 41.42% had a polyp or tumor in the rest of the colon, a difference that was not statistically significant (p value = 0.1615) (Fig. 1). Adenomas with a size ≥1 cm were observed in 101 (87.8%) of participants. Patient and histopathological characteristics are presented in Table 1.

Table 1 Clinicopathological features of patients
Fig. 1 Location of polyp/tumor in the colon. Most patients had polyps or tumors in the rectosigmoid, although the difference compared with other locations was not statistically significant (p value = 0.1615)

There were 515 participants with normal colonoscopy as compared to 115 adenoma positive patients. The mean age was significantly higher in the adenoma positive group (p value < 0.001). The gender distribution of the groups was not significantly different. The incidence of overweight and/or obesity (BMI > 25 kg/m2) was significantly higher in adenoma positive patients than in normal ones (49.9 and 0.9% respectively, p value = 0.04, Table 2). Interestingly, the incidence of a positive history of type 1 or type 2 diabetes in the adenoma positive group was significantly higher than in the control group (19.1 and 10.9%, respectively, p value = 0.02). Similarly, the incidence of a positive family history of CRC was significantly higher in adenoma positive patients than in normal colonoscopy cases (25.2 and 16.5%, respectively, p value = 0.03). Table 2 describes the association of age, BMI > 25 kg/m2, history of diabetes and family history of CRC with the risk of colon adenoma; the odds ratios for overweight/obesity and for family history of CRC-related diseases were 1.86 (95%CI, 1.24–2.82) and 1.76 (95%CI, 1.09–2.83), respectively.
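As background for the odds ratios and confidence intervals reported above, the sketch below shows one common way to compute an unadjusted odds ratio with a Wald (log-scale) 95% confidence interval from a 2 × 2 table. Whether the study used this method or an adjusted model is not stated, so this is an assumption; the function name and the counts in the usage example are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    Returns (odds_ratio, lower, upper).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study tables
    print(odds_ratio_ci(40, 60, 25, 75))
```

An interval that excludes 1, as in the intervals reported for Table 2, corresponds to a statistically significant association at the 5% level.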

Table 2 Association of potential risk factors between normal colonoscopy and adenoma positive

In the current study, there were 30 patients with adenocarcinoma. The mean age of patients with adenocarcinoma was higher than that of the normal group (59.2 years for adenocarcinoma and 47.5 years for normal colonoscopy, p < 0.001). Gender distribution was similar in both groups. Interestingly, the mean BMI of the cancer patient group was lower than that of the normal group (24.7 kg/m2 for adenocarcinoma and 27.6 kg/m2 for the normal group, p = 0.01). Positive history of type 1 or 2 diabetes and colon cancer was not significantly different between these groups, as reported in Table 3. When positive lesions (positive adenoma and adenocarcinoma) were compared with the control group, the p values for BMI > 25 kg/m2 and diabetes were 0.5 and 0.02, respectively.

Table 3 Association of potential risk factors between normal colonoscopy and adenocarcinoma

In this study, several diseases were considered as direct or indirect risk factors for CRC. Direct diseases, which have a direct effect on colon cancer, included anaemia, blood clotting, thyroid disorders, sexually transmitted disorders, type 1 or 2 diabetes, gynaecological diseases, acromegaly, and stomach and colon diseases. In contrast, indirect diseases, which have an indirect effect on CRC, included high blood pressure, high cholesterol, and heart and liver diseases [11]. The data demonstrated a significant association between type 1 or 2 diabetes and the incidence of colon adenoma (DOR = 1.831, 95%CI = 1.058–3.169, p = 0.023) (Table 4).

Table 4 Direct and indirect disease effect on colon adenoma polyp

After identifying the factors associated with a higher risk of colon adenoma, we assessed the prediction performance of five classification methods (DT, RF, NN, KNN and SVM) in discriminating between the normal colonoscopy and adenoma positive groups. Classification methods are used to categorize a set of observations into pre-defined classes based on a set of variables. Classification accuracy and root mean squared error were the main criteria for evaluating the classification and prediction of samples in the test phase. We evaluated the five classification methods on the higher risk factors in the colon adenoma and normal colonoscopy data. The performance results of the five classification methods are presented in Fig. 2. For each classification method, the classification accuracy was more than 82% and the root mean squared error was less than 0.42. In Fig. 3, the hierarchical structure generated by the DT method could be used to classify individuals into the normal or adenoma positive groups based on the identified risk factors: BMI (kg/m2) ≥25 and age (11–85 yr), as well as diabetes, family history of colon cancer and drug abuse, which are binary variables.
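The two evaluation criteria can be illustrated with a short Python sketch. It assumes a binary outcome coded 1 for adenoma positive and 0 for normal colonoscopy, and takes the root mean squared error over predicted class probabilities, which is one common definition for classifiers; the exact computation performed by Weka in this study may differ. All names and values below are hypothetical.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, prob_positive):
    """Root mean squared error between 0/1 labels and predicted probabilities."""
    if len(y_true) != len(prob_positive) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, prob_positive)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical test-phase labels (1 = adenoma positive) and probabilities
    labels = [1, 0, 0, 1, 0]
    probs = [0.8, 0.3, 0.6, 0.4, 0.1]
    preds = [1 if p >= 0.5 else 0 for p in probs]  # 0.5 threshold is an assumption
    print(accuracy(labels, preds), rmse(labels, probs))
```

Accuracy depends only on the thresholded labels, whereas the root mean squared error also penalizes poorly calibrated probabilities, which is why the two criteria can rank classifiers differently.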

Fig. 2 Performance of the five classification methods, showing the classification accuracy (a) and the mean squared error (b)

Fig. 3 The result of the decision tree based on high risk factors (BMI (kg/m2) ≥25, age (11-85 yr), type 1 or 2 diabetes, family history of colon cancer and drug abuse)

Discussion

Recent studies have determined that obesity is a potential risk factor for a variety of malignancies. It has previously been shown that approximately 50% of patients with cancer had an abnormally high BMI [33]. In addition to genetic and environmental factors that contribute to CRC development, some studies have considered gender and ethnicity as predisposing factors for CRC [34, 35]. In the current study, the occurrence of obesity measured by BMI was significantly higher in the adenoma positive group than in the control group, although there was no notable association between obesity and adenocarcinoma.

Obesity can be evaluated through several anthropometric indexes such as BMI, waist circumference (WC) and waist-to-hip ratio (WHR). In a recent study, Wambui et al. showed that WC was a better predictor of advanced colorectal neoplasia than BMI. The study demonstrated that subjects who were overweight at the age of 21 had a higher risk of CRC than individuals with a normal BMI. Thus, they concluded that maintaining an unhealthy BMI and WC might raise the risk of CRC [36]. WC may be a stronger predictor of CRC risk than BMI, but this is still controversial and has not been confirmed [6].

It has been revealed that visceral fat (abdominal fat) is associated with insulin impairment and a high IGF2 serum level [35]. In accordance with this, a cohort study with 23 years of follow-up by Levi et al. showed that adolescents (male or female) with overweight or obesity were prone to colon and rectal cancer [37]. Brenner et al. indicated that the increasing incidence of CRC in younger adults might be associated with obesity as a prominent etiological factor [38]. Jensen et al. showed that, in 257,623 children, childhood BMI and height were significantly associated with colon cancer. In other words, taller and heavier children were more prone to colon cancer than participants in the normal range [39]. Hanyuda et al. found that the association between BMI and CRC risk differs significantly depending on the presence or absence of poorly-differentiated foci. In the absence of poorly-differentiated foci, high BMI was associated with a higher risk of CRC [40]. In a study by Shaukat et al., increased BMI was related to long-term colorectal mortality, while reduced BMI could modulate the risk of cancer mortality [41]. Dong et al. conducted a meta-analysis of 12,837 CRC cases and showed that abdominal obesity was associated with CRC; increased WC and WHR were profoundly associated with the risk of CRC [6]. The underlying mechanism leading to cancer is still under investigation. It is assumed that adipose tissue produces different types of hormones and pro-inflammatory cytokines, including IL-6, TNF-α, leptin and adiponectin, which could provide a favourable inflammatory microenvironment for cancerous cells [22, 25]. Besides, it has been shown that high levels of IL-23 and IL-10 in serum [29, 42] and of IL-8 and IL-6 in the microenvironment are associated with progression of CRC [43,44,45]. Recent investigations have highlighted the role of IGF in CRC. IGF1 and IGF2 have been associated with numerous GI cancers [46, 47].
Several studies have elucidated that serum level and loss of imprinting of IGF2 were associated with advanced colorectal adenoma and poor prognosis in advanced stages of CRC, respectively [48,49,50].

In the current study, we also demonstrated an association between colon adenoma and diabetes (type 1 or 2), suggesting that diabetes could be a risk factor for adenoma but not for CRC. In this regard, in a cohort study, diabetes mellitus was not associated with any cancer, including CRC [51, 52]. It appears that diabetes mellitus does not decrease the survival of CRC patients and that CRC does not have a significant impact on the glucose level of patients with diabetes mellitus [53, 54]. In a study of 3000 CRC cases followed for up to 32 years, type 2 diabetes was significantly associated with a high risk of CRC in comparison to controls, but only among men [55]. In contrast, recent studies emphasize the relationship between diabetes and CRC. He et al. conducted a prospective cohort study of 199,143 participants, indicating that there was a significant risk of CRC in diabetic patients as compared to non-diabetic ones [54], especially in those younger than 65 years and in non-white people [55]. Consistent with this study, Overbeek et al., in 55,000 patients with type 2 diabetes and 215,000 matched controls, demonstrated that both men and women with diabetes had a higher chance of developing CRC [56]. This discrepancy between studies has not yet been well explained.

This report was a retrospective study, which has several limitations, such as sampling error and the lack of waist circumference and waist-to-hip ratio measurements for comparison with BMI. In future work, these measurements could be recorded, and patient follow-up would also be informative. In addition, multi-center studies could be performed to increase the power and scientific quality of the study. Healthcare and policy decision making for CRC screening programs could also benefit from the results of such studies.

Conclusion

This report showed that there were significant differences in age distribution and BMI between the case and control groups. It also demonstrates a significant association between colon adenoma and a positive history of type 1 or type 2 diabetes or a family history of colon cancer. Our results indicate that both diabetes and overweight/obesity (BMI ≥25 kg/m2) increase the risk of precancerous lesions. Therefore, such patients may consider screening for CRC at an earlier age, although controversies still exist.