A comparison of different machine-learning techniques for the selection of a panel of metabolites allowing early detection of brain tumors

Godlewski, Adrian; Czajkowski, Marcin; Mojsak, Patrycja; Pienkowski, Tomasz; Gosk, Wioleta; Lyson, Tomasz; Mariak, Zenon; Reszec, Joanna; Kondraciuk, Marcin; Kaminski, Karol; Kretowski, Marek; Moniuszko, Marcin; Kretowski, Adam; Ciborowski, Michal

doi:10.1038/s41598-023-38243-1

Article
Open access
Published: 08 July 2023

Volume 13, article number 11044, (2023)

Scientific Reports

Adrian Godlewski¹,
Marcin Czajkowski²,
Patrycja Mojsak¹,
Tomasz Pienkowski¹,
Wioleta Gosk¹,
Tomasz Lyson³,
Zenon Mariak³,
Joanna Reszec⁴,
Marcin Kondraciuk⁵,
Karol Kaminski⁵,
Marek Kretowski²,
Marcin Moniuszko^6,7,
Adam Kretowski^1,8 &
…
Michal Ciborowski¹

Abstract

Metabolomics combined with machine learning methods (MLMs) is a powerful tool for searching for novel diagnostic panels. This study was intended to use targeted plasma metabolomics and advanced MLMs to develop strategies for diagnosing brain tumors. Measurement of 188 metabolites was performed on plasma samples collected from 94 patients with gliomas (grade I–IV), 70 with meningioma, and 71 healthy individuals as a control group. Four predictive models to diagnose glioma were prepared using 10 MLMs and a conventional approach. Based on the cross-validation results of the created models, the F1-scores were calculated, and the obtained values were compared. Subsequently, the best algorithm was applied to perform five comparisons involving gliomas, meningiomas, and controls. The best results were obtained using the newly developed hybrid evolutionary heterogeneous decision tree (EvoHDTree) algorithm, which was validated using Leave-One-Out Cross-Validation, resulting in an F1-score for all comparisons in the range of 0.476–0.948 and an area under the ROC curve ranging from 0.660 to 0.873. Brain tumor diagnostic panels were constructed with unique metabolites, which reduces the likelihood of misdiagnosis. This study proposes a novel interdisciplinary method for brain tumor diagnosis based on metabolomics and EvoHDTree, exhibiting significant predictive coefficients.

Introduction

Gliomas are among the most common and debilitating primary tumors of the central nervous system (CNS) and are characterized by high mortality and recurrence rates^1,2. The survival rate depends, among other factors, on the tumor size and localization, age of manifestation, and molecular genetic factors³. According to the 2021 WHO Classification of Tumors of the Central Nervous System, gliomas are classified by their molecular features and can be divided into four grades based on their malignancy⁴. Thus, CNS WHO grade I and II gliomas are commonly considered low-grade gliomas (LGG), while CNS WHO grade III and IV gliomas are considered high-grade gliomas (HGG). Moreover, LGG that are less aggressive at the time of diagnosis may eventually transform into HGG within 5–10 years, increasing the fatality risk⁵. Despite new surgical and therapeutic techniques, the median survival of patients diagnosed with CNS WHO IV glioma ranges from 12 to 15 months^5,6. Current pre-surgical diagnostic methods may not be sensitive or specific enough to determine brain tumor type or grade on their own, contributing to the poor prognosis of patients with glioma.

However, brain tumor research should not be limited to malignant entities. The most frequently diagnosed type of primary brain tumor is meningioma (meningeal tumor, MT)⁷. Unlike gliomas, MTs arise from the meninges, the membranes that surround the brain and spinal cord. Thus, technically, an MT does not originate from brain cells, but because it may compress the adjacent nerves and vessels, it is included in the category of brain tumors⁸. In comparison to gliomas, MTs present a high 5-year survival rate, ranging from 87 to 97% depending on the patient's age⁹. However, the statement that MT cannot be malignant is false; some benign or atypical MTs can have clinically aggressive behaviors⁴.

Brain tumors, like other neoplastic diseases, develop according to the hallmarks of cancer proposed by Hanahan and Weinberg¹⁰. These are mainly: the limitless ability to proliferate, escape from cell death through altered apoptotic responses, capacity to stimulate angiogenesis, self-sufficiency in growth-stimulating signals, insensitivity to classic growth inhibitory signaling, and migratory and invasive behavior¹⁰. Additional hallmarks are genome instability and tumor-promoting inflammation, which both affect cellular energetics and immune evasion¹⁰. Each of the hallmarks eventually leaves its mark on the metabolome, as the metabolites are the final products of every biochemical pathway. In accordance with the Warburg effect, the increased intake of lipids, amino acids (AA), and glucose to obtain nutrients for growth is the leading change in metabolism affecting global metabolome composition¹¹. Consequently, metabolomics could provide information about tumorigenesis and tumor progression by recognizing modified metabolic pathways and altered lipid profiles². Moreover, metabolomics tools are already powerful enough to indicate metabolites that may be useful as a support for glioma diagnosis or for distinguishing histological grades and types^12,13,14. In the case of MTs, metabolomics research is scarce. However, the few available studies indicate that MT metabolomic signatures can be associated with histological markers of poor prognosis, such as Ki-67 or progesterone receptor expression¹⁵. Quantifying small molecules characteristic of brain tumors in plasma or serum would allow for quicker phenotype determination of cancer in patients, shortening the diagnostic process and hastening the implementation of appropriate treatment¹⁶. Methods based on mass spectrometry allow for simultaneous monitoring of the concentration of many metabolites, which can give a basis for the development of a highly sensitive and specific diagnostic test¹⁷. First, however, an appropriate panel of metabolites characteristic of the relevant neoplastic disease must be selected.

Recently, metabolomics techniques have been combined with machine learning methods (MLMs), which are used for the preparation of diagnostic or prognostic panels of biomarkers. This trend is an important alternative to standard statistical methods such as partial least squares regression, which can only be applied to model linear latent covariance¹⁸. However, biological data are usually non-linear¹⁹, requiring complex computational algorithms. Non-linear machine learning methods such as random forest (RF) and kernel support vector machine (SVM) may be better suited for extensive amounts of metabolomic data^12,20. MLMs have already been applied to develop diagnostic methods for such cancers as lung¹², breast²⁰, endometrial²¹, kidney²², oral²³, liver²⁴, and prostate¹⁷. By using such advanced data analysis techniques, it is possible to obtain diagnostic panels with high model effectiveness coefficients. However, the model should be appropriately optimized since overfitting may lead to falsified results and erroneous conclusions¹⁸. Moreover, most MLMs tend to focus almost exclusively on prediction accuracy (ACC) and propose complex predictive models. Such an approach hinders the process of uncovering new biological understanding and is often an obstacle for mature applications^25,26. In this research, we focus on both complex and simple MLMs: Naive Bayes (NB), Generalized Linear Model (GLM), Logistic Regression (LR), Fast Large Margin (FLM), Deep Learning (DL), Decision Tree (DT), RF, Gradient Boosted Trees (GBT), SVM, and Evolutionary Heterogeneous Decision Tree (EvoHDTree)²⁵. So far, most of them have not been used to identify glioma diagnostic panels composed of small molecules. Consequently, the main goal of this study was to develop a glioma diagnostic strategy, notably in LGG, using targeted analysis of metabolites by liquid chromatography coupled with tandem mass spectrometry (LC–MS/MS) in combination with MLMs. We focused on elaborating diagnostic panels that allow the diagnosis of the glioma grades and distinguish a malignant tumor from a non-malignant tumor, such as MT. To our knowledge, this is the first study in which ten MLMs and univariate statistics (UVS) were applied to plasma metabolomics data in order to indicate the best panel of markers for glioma diagnosis.

Results

The aim of this study was to identify a panel of metabolites that can be used for the routine diagnosis of brain tumors. In the first part of this study, we applied a conventional statistical approach (UVS followed by ROC analysis) and ten MLMs, including the novel EvoHDTree hybrid algorithm, to analyze the obtained metabolomics data. We performed four comparisons: patients with grade I and II glioma (GI–II) vs. controls (Con), patients with grade III glioma (GIII) vs. Con, patients with grade IV glioma (GIV) vs. Con, and glioma patients without grade division (GI–IV) vs. Con. The confusion matrices obtained for all comparisons were used to calculate the classifier evaluation parameters for each method. Subsequently, the obtained values of AUC, ACC, and F1-score were compared to choose the best predictive method. The benchmark of the methods used is shown in Table 1. The F1-score was used to compare the applied MLMs since this metric combines the classifier's precision and recall into a single value.
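As an illustration of how ACC and the F1-score follow from a confusion matrix, the minimal Python sketch below computes both from the four cell counts. The function name and the example counts are hypothetical and are not taken from Table 1.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (not study data).
balanced = classification_metrics(tp=8, fp=2, fn=2, tn=8)

# A classifier that never predicts the positive class can still reach a
# non-zero accuracy while its F1-score drops to 0.
never_positive = classification_metrics(tp=0, fp=0, fn=5, tn=15)
```

The second call shows why the F1-score can be a stricter criterion than ACC when the groups are unbalanced: with no true positives the F1-score is 0 even though most samples are classified correctly.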

Table 1 Comparison of the parameters of ROC curves generated with different MLMs.

Full size table

Based on the results presented in Table 1, we can conclude that the highest F1-scores for the four comparisons were obtained with the EvoHDTree algorithm (0.714–0.985) and RF (0.454–0.956). The conventional approach (UVS followed by ROC analysis with SVM) yielded results comparable to those of the newly developed hybrid method (F1-score range of 0.578–0.940). The least useful model was created with logistic regression (F1-score range of 0–0.779). ROC analysis based on statistically significant metabolites and EvoHDTree both proved to be valuable tools for preparing prediction models. However, considering the results of the GI–II vs. Con comparison, EvoHDTree performed better than the conventional approach. For the EvoHDTree method, the F1-score and ACC values were 0.714 and 0.910, respectively, while for the conventional approach they were 0.578 and 0.787, respectively. The Friedman test showed statistically significant differences between the algorithms in terms of ACC (significance level of 0.05). According to Dunn's multiple comparison test, EvoHDTree significantly outperformed the other solutions in almost all comparisons. Additionally, as seen in the example of the GI–II vs. Con comparison (Fig. 1), EvoHDTree is an easy-to-understand algorithm, and the obtained results are straightforward to interpret. For this reason, we have chosen the newly developed hybrid algorithm for further analysis.
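The Friedman test ranks the algorithms within each comparison and asks whether the rank sums differ more than chance would allow. The sketch below computes only the basic test statistic in plain Python, under simplifying assumptions: no tie correction, no p-value, and no Dunn post hoc step. The score matrix is made up for illustration and does not reproduce Table 1.

```python
def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """scores[b][a] is the score of algorithm a on comparison (block) b."""
    n = len(scores)      # number of comparisons
    k = len(scores[0])   # number of algorithms
    rank_sums = [0.0] * k
    for row in scores:
        for a, r in enumerate(average_ranks(row)):
            rank_sums[a] += r
    # Classic chi-square form of the statistic, without tie correction.
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Hypothetical ACC values: three comparisons (rows) x three algorithms (columns).
example_scores = [
    [0.70, 0.80, 0.90],
    [0.60, 0.75, 0.95],
    [0.65, 0.70, 0.85],
]
stat = friedman_statistic(example_scores)
```

In this toy example the three algorithms are ranked identically in every comparison, so the statistic reaches its maximum for three blocks and three treatments. A statistical package would normally be used to obtain the p-value and to apply the post hoc test.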

In the second part of the experiment, we used EvoHDTree to prepare predictive models for the following comparisons: GI–II vs. MT, GIII vs. MT, GIV vs. MT, GI–IV vs. MT, and Con vs. MT. The obtained models were validated using the cross-validation method and re-validated with the restrictive Leave-One-Out Cross-Validation (LOOCV). A summary of the results of these two validations is shown in Table 2. As can be seen, the values obtained for the ACC and F1-score parameters are usually lower when LOOCV was performed. This is related to the specificity of the method, which tests each observation separately rather than only a held-out test group, as in the case of cross-validation²⁷. Metabolites used by EvoHDTree to develop predictive panels for the nine comparisons are presented in Fig. 2. Venn diagrams show the number of metabolites selected by the EvoHDTree algorithm in the nine comparisons (Fig. 2A). Metabolites composing the diagnostic panels for the GI–II vs. Con, GI–II vs. MT, and MT vs. Con comparisons did not overlap.
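The LOOCV procedure can be sketched in a few lines of Python: each sample is held out once, a model is fitted on the remaining samples, and the held-out sample is predicted. The nearest-centroid classifier used here is only a stand-in chosen for brevity; it is not EvoHDTree, and the one-feature data set is invented.

```python
def nearest_centroid_fit(xs, ys):
    """Return the mean feature value of each class label."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def nearest_centroid_predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def loocv_predictions(xs, ys):
    """Hold out every sample once and predict it from all the others."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = nearest_centroid_fit(train_x, train_y)
        preds.append(nearest_centroid_predict(model, xs[i]))
    return preds


# Invented single-metabolite concentrations, for illustration only.
xs = [1.0, 1.2, 0.9, 3.0, 3.2, 2.9]
ys = ["Con", "Con", "Con", "Tumor", "Tumor", "Tumor"]
preds = loocv_predictions(xs, ys)
accuracy = sum(p == y for p, y in zip(preds, ys)) / len(ys)
```

Because the model is refitted once per sample, the cost grows linearly with the number of samples, which is the long computation time associated with this validation scheme.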

Table 2 Comparison of two types of validation for the EvoHDTree algorithm.

Full size table

Finally, using the R programming language, we constructed ROC curves (Fig. 3) for the nine comparisons prepared by EvoHDTree. In summary, the data collected in Table 2 and shown in Fig. 3 indicate that, despite the application of LOOCV, the results are still characterized by high prediction coefficients. ACC for the nine comparisons ranged from 0.750 to 0.975, and AUC ranged from 0.660 to 0.873. In addition, in order to confirm the correct selection of metabolites by EvoHDTree, we performed biochemical pathway analysis using the online tool MetaboAnalyst 5.0. For pathway analysis, we included the 45 metabolites (Fig. 2B) selected by the newly developed hybrid algorithm. We observed changes mainly in four biological pathways (Table S1): aminoacyl-tRNA biosynthesis; arginine biosynthesis; alanine, aspartate and glutamate metabolism; and phenylalanine, tyrosine and tryptophan biosynthesis. An overview of the pathway analysis is shown in Fig. 4.
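The AUC reported alongside the ROC curves has a simple interpretation: it is the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative one. The study built its ROC curves in R; the Python sketch below only illustrates that pairwise definition, with invented scores and a hypothetical function name.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented classifier scores, for illustration only.
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Here eight of the nine positive/negative pairs are ordered correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.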

Discussion

Malignant gliomas are responsible for the majority of deaths associated with primary brain tumors. However, early diagnosis could improve the survival rate²⁸. In recent years, significant progress has been made in understanding the fundamental metabolic changes related to glioma progression and biology^2,29. Still, a reliable and accurate method for preoperative brain tumor identification has yet to be developed. The literature indicates that analysis of changes in blood metabolite profiles could be an attractive approach to discovering valuable novel glioma biomarkers^2,30. It has been shown that targeted metabolomics analysis based on mass spectrometry may become a useful diagnostic platform in clinical practice due to its high sensitivity and effective throughput³¹. Therefore, aiming to improve brain tumor diagnosis, we used a targeted metabolomics approach (AbsoluteIDQ p180 kit), which allows quantification of up to 188 metabolites from 6 compound classes (AAs, biogenic amines, acylcarnitines, lysophosphatidylcholines, phosphatidylcholines (PC), sphingolipids, and sum of hexoses), for metabolic profiling of plasma samples from patients with glioma or MT and from Con. However, working with biomedical data generated by high-throughput technology, such as the one used in this study, can be challenging due to its large size, enormous dimensionality, and natural diversity^26,32. In this work, MLMs were applied so that all the measured variables could be considered during the development of a brain tumor diagnostic strategy.

Machine learning approaches are attracting growing interest as a way to derive actionable knowledge from large data sets generated using LC–MS/MS methods and to improve metabolic profiling endeavors. To the best of our knowledge, this study is the first to compare 10 different supervised MLMs, including the newly developed hybrid method (EvoHDTree), with the conventional approach to determine metabolomics-based prognostic signatures in gliomas. Previously, conventional approaches were widely used in the metabolomics studies of various diseases^13,33. Currently, novel machine learning algorithms are gaining popularity for constructing predictive methods for various types of cancer^{12,17,20,21,22,23,24,34,35,36,37,38,39,40}.

Decision trees are one of the most popular “white box” prediction techniques⁴¹. The success of tree-based approaches can be explained by their effectiveness, ease of interpretation, and extraction of possible diagnostic rules. However, according to recent literature reports, they may not be compatible with current biological data generated by high-throughput technologies due to the enormous dimensionality, experimental noise, and other perturbations^25,32. For this reason, we proposed a new solution, EvoHDTree, which combines DT techniques with evolutionary algorithms and the recently developed concept of Relative eXpression Analysis (RXA). This approach performed very well in the case of genomics data²⁵. Therefore, it seemed reasonable to use it to analyze other omics data, namely metabolomics data. This innovative approach made it possible to prepare glioma diagnostic panels with high predictive coefficients.

Comparing the results (Table 1) for the four comparisons (different glioma grades vs. Con) for all the algorithms applied, we concluded that similar results were obtained using EvoHDTree and the conventional approach. Diagnosing a patient with LGG increases the likelihood of a cure before it transforms into HGG and thus significantly increases the chances of survival⁵. For this reason, we focused on the GI–II vs. Con comparison, in which we obtained better results using the new hybrid algorithm. Although the other machine learning methods utilized in this study identified a variety of discriminating metabolites, these methods yielded a considerably larger number of metabolites composing the diagnostic panels, which can make interpretation and subsequent application more challenging. A larger pool of discriminative features may initially appear beneficial, but it carries the risk of overfitting. In addition, the EvoHDTree algorithm selected metabolites for the predictive models without repetition across comparisons (Fig. 2A); we therefore applied this method in the second part of the experiment. The unique composition of metabolites chosen for each comparison increases the possibility of distinguishing gliomas from MT. Notably, the novelty of EvoHDTree lies in its flexible tree node representation, which involves both classical univariate tests and bivariate tests inspired by the RXA concept. Furthermore, we improved evolutionary exploration and exploitation by incorporating our knowledge of decision tree induction and RXA methodology and by designing more than a dozen specialized variants of recombination operators.

In the second part of the experiment, we used EvoHDTree to perform four comparisons between gliomas and MTs, as well as MT vs. Con. The purpose of this section was to assess whether there is an overlap between the metabolites used to construct the diagnostic panels for glioma and MT. Applying the same metabolites to distinguish brain tumors could introduce a bias and lead to misdiagnosis. Considering this, we have developed panels of metabolites that can distinguish glioma patients from MT subjects. Subsequently, we again validated nine predictive models using the LOOCV method to verify the obtained results. Despite the restrictive validation method employed, the ACC results obtained for the nine comparisons are still characterized by high predictive coefficients falling within the range of 0.750–0.975. LOOCV is widely regarded as an excellent tool to validate MLM properly in studies based on smaller study groups⁴². Niu et al.⁴³ reported that there is no need to divide the dataset into a training set and a test set if the quality of the model is tested using the jackknife test (LOOCV), since the result obtained is a combination of many different independent tests of the dataset. Therefore, LOOCV is increasingly recognized and widely applied by researchers to test the power of prediction methods, despite the drawback of long computation time.

Early glioma detection ensures faster implementation of treatment and thus may contribute to prolonged survival³⁰. Therefore, our study focused on a comparison involving LGG and Con. The diagnostic panel for the GI–II vs. Con comparison prepared with the EvoHDTree hybrid algorithm mainly used four metabolites (Fig. 1): three AAs (taurine, aspartate, asparagine) and sphingomyelin (SM) C24:1. Recently, differences in the levels of certain AAs in the blood of patients with glioma compared to Con have been demonstrated^44,45. In our study, increased levels of SM C24:1 and asparagine and decreased levels of aspartate and taurine were observed in the GI–II vs. Con comparison. According to Jothi et al.⁶, taurine occupied the top-most position in discriminating the grades of gliomas, followed by other metabolites such as creatinine and glutamine. In addition, taurine has been considered a potential marker of apoptosis in gliomas⁴⁶. Taurine exhibits antineoplastic and antioxidant properties, but its primary role is osmoregulation⁴⁷. Moreover, taurine is presumed to be a determinant nutritional molecule during the regeneration and development of the central nervous system⁴⁸. The decrease in aspartate with increasing glioma grade is due to the conversion of this AA to asparagine by asparagine synthetase. Asparagine, as Thomas et al.⁴⁹ proposed, is a crucial factor in brain tumor growth under nutrient-deprived conditions. In parallel to AA metabolism, our study also highlighted the role of lipids in this disease. In our study, SM C24:1 was positively associated with tumor aggressiveness, as the mean concentration of this lipid increased in subsequent glioma vs. Con comparisons. According to the literature, further tumor growth after the initiation of tumorigenesis is possible due to the evasion of effector cells, which is enabled through an increase in SM concentration in the cell surface membrane. Partial inhibition of the conversion of SM to ceramide, an essential signaling molecule for tumor biology, cell proliferation, apoptosis, aging, and cell migration, facilitates tumor progression^50,51,52.

Subsequent comparisons regarding HGG and Con prepared by the EvoHDTree algorithm were each based on seven metabolites. For GIII vs. Con, these were kynurenine, creatinine, taurine, methionine, and three PCs (PC ae C44:6, PC aa C42:0, PC ae C38:5). The panel for the GIV vs. Con comparison was built using methionine, creatinine, phenylalanine, asymmetric dimethylarginine (ADMA), PC ae C32:1, PC aa C42:6, and lysoPC a C18:0. In our study, upregulation of ADMA, phenylalanine, methionine, and almost all lipids and downregulation of PC aa C42:6, lysoPC a C18:0, kynurenine, and creatinine were observed in comparisons of HGG vs. Con. Du et al.⁵³ demonstrated that the indoleamine 2,3-dioxygenase 1/tryptophan 2,3-dioxygenase signaling pathway, which accounts for kynurenine release, may regulate the expression of aquaporin 4, promoting motility of glioma cells. Additionally, Samanic et al.⁵⁴ reported that in gliomas, the tryptophan/kynurenine ratio was positively correlated with the pathologic grades, which emphasized the perturbation in the kynurenine pathway in gliomas. ADMA, however, is involved in the dimethylarginine dimethylaminohydrolase/ADMA/nitric oxide pathway. Perturbation of this pathway can result in increased local availability of nitric oxide, which promotes tumor angiogenesis, as well as growth, invasion, and metastasis⁵⁵. Moreover, Gorynska et al.¹⁶ reported the possibility of using solid-phase microextraction during metabolomic phenotyping of gliomas and provided evidence for disruption of the phenylalanine metabolism pathway. Gorynska et al.¹⁶ also found that methionine disruption can be correlated with gliomas harboring 1p19q codeletion. Tumor-initiating cells in heterogeneous tumors exhibit increased methionine cycle activity driven by increased methionine adenosyltransferase 2A, which converts methionine to S-adenosylmethionine⁵⁶. Creatine has been shown to be the sole precursor of creatinine. During an irreversible non-enzymatic reaction, creatine is converted to creatinine, which is excreted by the kidneys with the urine⁵⁷. A decrease in creatine was observed in a study by Kinoshita et al.⁵⁸, who used nuclear magnetic resonance spectroscopy to compare brain tumor sections to normal cortex. Downregulation of creatinine levels in gliomas compared to Con may be associated with malnutrition or muscle atrophy, as was shown by das Neves et al.⁵⁹ in patients with non-small-cell lung cancer. Li et al.⁶⁰ showed that the levels of some PCs (PC aa C38:4, PC aa C36:3, PC aa C38:6) and lysoPC a C18:0 in glioma tissue were higher than in control samples. Our study shows that the concentrations of lysoPC a C18:0 in the examined plasma were similar in GI, GII, GIII, MT, and control samples. However, the concentration of this lysoPC significantly decreased in GIV plasma samples, suggesting an increased accumulation of these lipids in HGG. Interestingly, Li et al.⁶⁰ found an absence of PC aa C36:1 in glioma tissues compared to control brain tissues. In contrast, Yu et al.⁶¹ reported that PC (36:1) showed lower levels in glioma tissues than in parietal lobe tissues. The literature reports include information on changes in the lipidomic profile of glioma concerning glycerolipids, prenol lipids, cholesterol lipids, phospholipids, and sphingolipids. For this reason, altered lipid metabolism may affect the molecular phenotype of glioma⁶⁰.

A diagnostic panel to distinguish MT from Con was prepared using: kynurenine, symmetric dimethylarginine (SDMA), ADMA, phenylalanine, trans-4-hydroxyproline, and phosphatidylcholines. Concentrations of kynurenine, trans-4-hydroxyproline, PC ae C38:6, PC aa C40:2, and PC aa C36:2 were higher in Con plasma than in MT. In contrast, concentrations of SDMA, ADMA, phenylalanine, PC ae C38:5, and PC ae C42:3 were lower in Con. However, to discriminate glioma from MT using EvoHDTree, we developed four diagnostic panels based mainly on lipid compounds (PCs, lysoPCs, and SMs), four AAs (arginine, tryptophan, taurine, and citrulline), and two acylcarnitines (butyrylcarnitine and octadecadienylcarnitine). Few metabolomics studies on MTs have been published. Gorynska et al.¹⁶, in their study of glioma and MT tissues, reported that patients with MTs had higher levels of aspartic acid, lysine, and arginine. Most metabolomics work on MTs has been done using nuclear magnetic resonance spectroscopy^15,62,63,64. Baranovicova et al.⁶³ used RF to build ROC curves to distinguish MT from Con. They used five metabolites for this purpose: creatine, pyruvate, citrate, formate, and glucose. In their paper, Monleon et al.⁶² describe that the metabolic phenotype of MTs with complex karyotypes exhibits standard features of aggressive tumor biochemistry, including increased turnover of membrane metabolites and high glycolytic activity. Decreased levels of ascorbate and glucose and increased lactate levels suggest a greater reliance on anaerobic pyruvate breakdown, indicating a locally hypoxic microenvironment⁶². Moreover, Ijare et al.⁶⁴, in their study, indicated that alanine, glutamine, and glutamate were significantly elevated in MT grade II. They also demonstrated that blocking glutamine metabolism with the GLS1 inhibitor led to a decrease in meningioma cell proliferation. Interestingly, the higher glutamine metabolism observed in MT grade 1 resulted in improved sensitivity to treatment⁶⁴.

Additionally, pathway analysis was performed to better understand the dysregulation of small molecules, which may be a source of specific early disturbances possibly associated with the development of glioma. Through the pathway analysis, we identified the four most important altered metabolic pathways, namely: (1) aminoacyl-tRNA biosynthesis, (2) arginine biosynthesis, (3) alanine, aspartate, and glutamate metabolism, (4) phenylalanine, tyrosine, and tryptophan biosynthesis (Fig. 4). These pathways are involved in the regulation of cell proliferation, survival, differentiation, and angiogenesis. The same biochemical pathways were found perturbed in gliomas in other studies^{1,16,65,66,67}.

However, this work has some limitations. The small number of LGG patients may have an impact on the validity of the statistical tests. Another potential limitation is the outdated classification of gliomas. In May 2021, WHO published a new classification of CNS tumors, based on histological features and genetically defined mutation status^4,68. In our experiment, patients were recruited before the publication of the new WHO classification; thus, the diagnosis was performed according to the classification in force at that time. Although promising, the obtained results require validation in a larger cohort of patients of different ethnicities, grouped according to the new classification. A larger cohort would expose the algorithms to a greater variety of cases at the learning stage.

In conclusion, this study provides a new strategy for LGG diagnosis using targeted plasma analysis based on LC–MS/MS and the newly developed hybrid EvoHDTree method. Thanks to this innovative approach, it was possible to prepare diagnostic panels with high predictive coefficients. In the future, the hybrid algorithm we applied could be adapted to other cancers apart from gliomas.

Material and methods

Patients and groups

A total of 240 subjects recruited between 2016 and 2021 from the Department of Neurosurgery of the Medical University of Bialystok were included in the experiment. Patients with incomplete clinical information were excluded. Finally, 94 patients with gliomas (GI = 3, GII = 15, GIII = 10, GIV = 66) and 70 with MT were included. Due to the collection date, the histopathological examination of the surgical sections determined the degree of advancement of individual tumors based on the fourth edition of the WHO Classification of Tumors of the Central Nervous System published in 2016⁶⁹. Then, based on body mass index (BMI), gender, age, and the absence of comorbidities and addictions, 71 subjects undergoing routine health testing within the general population-based cohort study Bialystok PLUS were entered into the experiment as healthy controls (Con). A total of 235 samples were used to develop diagnostic panels. The characteristics of patients and controls are shown in Table 3. The study was approved by the Ethics Committee of the Medical University of Bialystok (No. APK.002.103.2022). Each participant signed informed consent before sample collection. The investigation conforms with the principles outlined in the Declaration of Helsinki.

Table 3 Median clinical parameters of the studied groups.

Full size table

Targeted metabolomic study using LC–MS/MS

Measurement of 188 plasma metabolites was performed using the AbsoluteIDQ p180 kit from Biocrates Life Sciences AG (Innsbruck, Austria) according to the protocol provided by the manufacturer⁷⁰. Briefly, after dissolving the quality control (QC) samples, the calibration standards, and the mixture of internal standards (ISTD) in water, 10 µL of ISTD was added to each well of the 96-well plate. In the following step, plasma samples, QCs, blank samples, and ISTD were added to the appropriate wells in the extraction plate. After evaporation in a vacuum concentrator (SpeedVac Concentrator, Thermo Fisher Scientific, Savant SPD2010), samples were derivatized with a mixture of ethanol, water, pyridine, and phenyl isothiocyanate. Following the evaporation of the reaction mixture, the analytes from the filters were extracted with 5 mM ammonium acetate in methanol. For analysis by LC–MS/MS, 150 µL of the extract was diluted with the same volume of water. However, in the case of flow-injection analysis coupled with tandem mass spectrometry, the extract was diluted in a 1:49 ratio with the solvent supplied with the kit.

Samples were analyzed in a randomized order in three batches using ultrahigh-performance liquid chromatography (1290 Infinity II, Agilent Technologies, Santa Clara, CA, USA) coupled with a tandem mass spectrometer (6470 Triple Quad LC/MS, Agilent Technologies, Santa Clara, CA, USA). The LC–MS/MS system was operated in positive polarity in multiple reaction monitoring mode.

Data treatment

Raw spectral data processing, quantification, and normalization were performed using MetIDQ software (Oxygen DB110-3005, Biocrates Life Sciences AG, Innsbruck, Austria). Data normalization was performed according to the Biocrates kit user manual. The obtained data were combined and filtered, retaining only metabolites present in at least 80% of the samples. In the resulting data matrix, missing values were substituted with half of the limit of detection for the given metabolite in the given batch. The final data matrix, containing 138 metabolites, was then used for MLM analysis and for the conventional statistical approach. A diagram showing the workflow is presented in Fig. 5.
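
The filtering and imputation rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (one dictionary per sample, `None` for a missing value), the `lod[batch][metabolite]` lookup, the function name `filter_and_impute`, and the toy concentrations are all hypothetical and are not taken from MetIDQ or from the study data.

```python
from typing import Dict, List, Optional

# Hypothetical layout: metabolite name -> concentration, None when missing.
Sample = Dict[str, Optional[float]]


def filter_and_impute(
    samples: List[Sample],
    batches: List[str],
    lod: Dict[str, Dict[str, float]],  # lod[batch][metabolite], assumed layout
    min_presence: float = 0.8,
) -> List[Dict[str, float]]:
    """Keep metabolites present in >= min_presence of samples, fill gaps with LOD/2."""
    metabolites = sorted({name for s in samples for name in s})
    n = len(samples)
    kept = [
        m for m in metabolites
        if sum(s.get(m) is not None for s in samples) / n >= min_presence
    ]
    cleaned = []
    for sample, batch in zip(samples, batches):
        row = {}
        for m in kept:
            value = sample.get(m)
            # Missing value: half of the batch-specific limit of detection.
            row[m] = value if value is not None else lod[batch][m] / 2.0
        cleaned.append(row)
    return cleaned


# Toy data (invented numbers): five samples measured in two batches.
samples = [
    {"Ala": 310.0, "Tau": 55.0, "C18": None},
    {"Ala": 295.0, "Tau": None, "C18": None},
    {"Ala": 402.0, "Tau": 61.0, "C18": 0.04},
    {"Ala": 350.0, "Tau": 48.0, "C18": None},
    {"Ala": 288.0, "Tau": 70.0, "C18": None},
]
batches = ["b1", "b1", "b2", "b2", "b2"]
lod = {
    "b1": {"Ala": 2.0, "Tau": 1.0, "C18": 0.02},
    "b2": {"Ala": 2.0, "Tau": 1.2, "C18": 0.02},
}

cleaned = filter_and_impute(samples, batches, lod)
```

In this toy example "C18" is dropped because it is present in only one of five samples, while the single missing "Tau" value is replaced with half of the batch "b1" limit of detection.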

Conventional statistical approach

UVS (the Wilcoxon test or the t-test, depending on the data distribution) was performed using the online tool MetaboAnalyst 5.0. The loaded data was not scaled or transformed. Based on statistically significant metabolites, receiver operating characteristic (ROC) curves with SVM as a classification method were prepared to evaluate the ability of these metabolites to classify study groups.
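
For readers unfamiliar with the nonparametric branch of this comparison, the following is a minimal sketch of a two-group Wilcoxon rank-sum test. It is not the MetaboAnalyst implementation: it uses the normal approximation without tie or continuity correction, the function names are hypothetical, and the concentrations are invented toy values.

```python
import math
from typing import List, Tuple


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a: List[float], group_b: List[float]) -> Tuple[float, float]:
    """Return (U statistic of group_a, two-sided p-value, normal approximation)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(group_a + group_b)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)  # no tie correction
    z = (u_a - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, p_two_sided


# Toy concentrations of one metabolite in two groups (invented numbers).
glioma = [5.1, 6.3, 7.0, 6.8, 5.9, 7.4]
control = [4.2, 4.9, 5.0, 4.4, 5.3, 4.7]
u_stat, p_value = rank_sum_test(glioma, control)
```

For groups as small as in this toy example an exact test would normally be preferred; the approximation is used here only to keep the sketch short.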

Machine learning methods

Ten classification algorithms were used to prepare binary classifiers: NB, GLM, LR, FLM, DL, DT, RF, GBT, SVM, and EvoHDTree. With the exception of the last method, all of these algorithms are state-of-the-art MLMs. The IntelliOmics platform⁷¹ was used to prepare and transform the datasets used in the experiments. Next, the algorithms were optimized and tested using RapidMiner software, one of the most popular and well-established tools in data mining. Within RapidMiner, we used the Auto Model module, which applies automated preprocessing steps, often including handling of missing values, outlier treatment, and scaling or transformation, as appropriate for each ML algorithm⁷². EvoHDTree is a new hybrid algorithm in the field of eXplainable Artificial Intelligence, which until now has been used for gene expression data²⁵. It combines evolutionary induction of DTs with a concept called Relative eXpression Analysis (RXA). Notably, the patterns discovered by EvoHDTree, like those produced by DT and LR, are easy to analyze and interpret.
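
To give an intuition for RXA, the sketch below shows a pairwise relative-expression rule of the kind commonly used in that family of methods: instead of thresholding a single feature, it asks whether feature A is lower than feature B within the same sample. This illustrates the general idea only and is not the EvoHDTree algorithm; the function name, the feature names "M1" to "M3", and the values are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def top_scoring_pair(
    rows: List[Dict[str, float]], labels: List[int]
) -> Tuple[float, str, str]:
    """Find the feature pair (a, b) whose ordering a < b best separates two classes."""
    features = sorted(rows[0])
    best = (-1.0, "", "")
    for a, b in combinations(features, 2):
        freq = {}
        for cls in (0, 1):
            members = [r for r, y in zip(rows, labels) if y == cls]
            # Fraction of samples in this class where a is lower than b.
            freq[cls] = sum(r[a] < r[b] for r in members) / len(members)
        score = abs(freq[1] - freq[0])
        if score > best[0]:
            best = (score, a, b)
    return best


# Toy data (invented): in class 1, M1 is always below M2; in class 0 it is above.
rows = [
    {"M1": 1.0, "M2": 3.0, "M3": 2.0},
    {"M1": 2.0, "M2": 5.0, "M3": 1.0},
    {"M1": 0.5, "M2": 0.9, "M3": 4.0},
    {"M1": 4.0, "M2": 2.0, "M3": 3.0},
    {"M1": 6.0, "M2": 1.0, "M3": 2.5},
    {"M1": 3.0, "M2": 2.5, "M3": 0.2},
]
labels = [1, 1, 1, 0, 0, 0]

score, feat_a, feat_b = top_scoring_pair(rows, labels)
```

Because such a rule depends only on the ordering of two measurements within one sample, it is unaffected by any monotonic rescaling applied to the whole sample, which is one reason rules of this kind are considered easy to interpret.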

Each algorithm has its own set of specific parameters that can be tuned to improve the performance of the model. Here are some examples of the specific parameters tested for a few commonly used algorithms:

DT: maximum tree depth and the minimum improvement in splitting;
RF: number of trees and maximum tree depth;
SVM: regularization parameter C and hyperparameter Gamma;
GBT: number of trees, maximum tree depth, learning rate;
FLM: regularization parameter C;
DL: adaptive learning rate option;
EvoHDTree: regularization parameter in the fitness function²⁵.

For each algorithm, an automatic search for the best combination of parameter values was used by iterating over a range of possible values and testing each combination against a performance metric (such as accuracy or AUC) to see which produces the best results. The setup and fine-tuning of the parameters were carried out on a subset of the training dataset and performed using Auto Model⁷².
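
The exhaustive search described above can be sketched as follows. The study used RapidMiner Auto Model for this step; the code below is only a generic illustration, and `grid_search`, `toy_score`, and the parameter ranges are hypothetical. In practice `evaluate` would train a model with the given parameters and return a validation metric such as accuracy or AUC.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def grid_search(
    param_grid: Dict[str, List[float]],
    evaluate: Callable[[Params], float],
) -> Tuple[Params, float]:
    """Try every combination in param_grid and return the best one with its score."""
    names = sorted(param_grid)
    best_params: Params = {}
    best_score = float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params: Params) -> float:
    """Stand-in for a validated metric; peaks at max_depth=4, n_trees=100."""
    return (
        0.9
        - 0.02 * abs(params["max_depth"] - 4)
        - 0.0005 * abs(params["n_trees"] - 100)
    )


# Hypothetical ranges for a random-forest-like model.
grid = {"max_depth": [2, 4, 8], "n_trees": [20, 100, 140]}
best_params, best_score = grid_search(grid, toy_score)
```

The number of evaluations grows multiplicatively with each added parameter (here 3 × 3 = 9), which is why tuning is usually limited to the few parameters that matter most for a given algorithm.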

LOOCV, a standard technique when the number of samples is relatively low, was used for validation and was performed on data not pre-divided into training and testing parts. This technique reduces the risk of overfitting by training the model on all but one of the data points and then validating it on the left-out data point, repeating the procedure for every sample. Classification was carried out without any prior feature selection, meaning that all available features in the dataset were used in the model. Because some of the algorithms are nondeterministic, the presented results are averages over 100 runs. Along with the confusion matrix, the ROC curve and the area under the curve (AUC) were generated for each solution.
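
The validation loop and the two reported metrics can be sketched as below. This is a minimal illustration, not the pipeline used in the study: a nearest-centroid scorer stands in for the actual classifiers, the left-out scores are pooled to compute a rank-based AUC, and all function names and data values are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def nearest_centroid_score(train_x: List[Vector], train_y: List[int], x: Vector) -> float:
    """Positive when x is closer to the class-1 centroid than to the class-0 centroid."""
    def centroid(cls: int) -> Vector:
        # Assumes both classes are present in the training fold.
        members = [v for v, y in zip(train_x, train_y) if y == cls]
        return [sum(col) / len(members) for col in zip(*members)]

    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return sq_dist(x, centroid(0)) - sq_dist(x, centroid(1))


def loocv_scores(xs: List[Vector], ys: List[int], scorer: Callable) -> List[float]:
    """Score each sample with a model fitted on all remaining samples."""
    return [
        scorer(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], xs[i])
        for i in range(len(xs))
    ]


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true: List[int], scores: List[float]) -> float:
    """Probability that a random positive scores above a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy single-feature data (invented): classes overlap around the value 5 to 6.
xs = [[1.0], [2.0], [3.0], [6.0], [5.0], [7.0], [8.0], [9.0]]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

scores = loocv_scores(xs, ys, nearest_centroid_score)
preds = [1 if s > 0 else 0 for s in scores]
f1 = f1_score(ys, preds)
roc_auc = auc(ys, scores)
```

With n samples the model is fitted n times, so LOOCV is practical mainly for small datasets such as the one used here; for nondeterministic learners the whole loop would be repeated and the metrics averaged.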

Data availability

The data supporting the findings of this study are available as part of the work and are included in the Supplementary Information.

Abbreviations

AA: Amino acid
ACC: Accuracy
ADMA: Asymmetric dimethylarginine
AUC: Area under the curve
BMI: Body mass index
CNS: Central nervous system
Con: Healthy control
DL: Deep Learning
DT: Decision Tree
EvoHDTree: Evolutionary Heterogeneous Decision Tree
FLM: Fast Large Margin
GI–IV: I–IV grade glioma
GBT: Gradient Boosted Trees
GLM: Generalized Linear Model
HGG: High-grade gliomas
ISTD: Internal standards
LC–MS/MS: Liquid chromatography coupled with tandem mass spectrometry
LGG: Low-grade gliomas
LOOCV: Leave-One-Out Cross-Validation
LR: Logistic Regression
MLM: Machine learning method
MT: Meningioma
NB: Naive Bayes
PC: Phosphatidylcholine
RF: Random forest
ROC: Receiver operating characteristic
RXA: Relative eXpression Analysis
SD: Standard deviation
SDMA: Symmetric dimethylarginine
SM: Sphingomyelin
SVM: Support vector machine
UVS: Univariate statistics
WHO: World Health Organization

References

Zhou, L. et al. Integrated metabolomics and lipidomics analyses reveal metabolic reprogramming in human glioma with IDH1 mutation. J. Proteome Res. 18, 960–969. https://doi.org/10.1021/acs.jproteome.8b00663 (2019).
Pienkowski, T., Kowalczyk, T., Garcia-Romero, N., Ayuso-Sacido, A. & Ciborowski, M. Proteomics and metabolomics approach in adult and pediatric glioma diagnostics. Biochim. Biophys. Acta Rev. Cancer 1877, 188721. https://doi.org/10.1016/j.bbcan.2022.188721 (2022).
Walid, M. S. Prognostic factors for long-term survival after glioblastoma. Perm. J. 12, 45–48. https://doi.org/10.7812/TPP/08-027 (2008).
Louis, D. N. et al. The 2021 WHO classification of tumors of the central nervous system: A summary. Neuro Oncol. 23, 1231–1251. https://doi.org/10.1093/neuonc/noab106 (2021).
Zong, H., Verhaak, R. G. & Canoll, P. The cellular origin for malignant glioma and prospects for clinical advancements. Expert. Rev. Mol. Diagn. 12, 383–394. https://doi.org/10.1586/erm.12.30 (2012).
Jothi, J., Janardhanam, V. A. & Krishnaswamy, R. Metabolic variations between low-grade and high-grade gliomas-profiling by. J. Proteome Res. 19, 2483–2490. https://doi.org/10.1021/acs.jproteome.0c00243 (2020).
Wiemels, J., Wrensch, M. & Claus, E. B. Epidemiology and etiology of meningioma. J. Neurooncol. 99, 307–314. https://doi.org/10.1007/s11060-010-0386-3 (2010).
Buerki, R. A. et al. An overview of meningiomas. Future Oncol. 14, 2161–2177. https://doi.org/10.2217/fon-2018-0006 (2018).
Meningioma: Statistics Cancer.Net, Available at: https://www.cancer.net/cancer-types/meningioma/statistics. Accessed 16 November 2022.
Colquhoun, A. Cell biology-metabolic crosstalk in glioma. Int. J. Biochem. Cell Biol. 89, 171–181. https://doi.org/10.1016/j.biocel.2017.05.022 (2017).
Cuperlovic-Culf, M., Ferguson, D., Culf, A., Morin, P. & Touaibia, M. 1H NMR metabolomics analysis of glioblastoma subtypes: Correlation between metabolomics and gene expression characteristics. J. Biol. Chem. 287, 20164–20175. https://doi.org/10.1074/jbc.M111.337196 (2012).
Xie, Y. et al. Early lung cancer diagnostic biomarker discovery by machine learning methods. Transl Oncol. 14, 100907. https://doi.org/10.1016/j.tranon.2020.100907 (2021).
Zhao, H. et al. Metabolomics profiling in plasma samples from glioma patients correlates with tumor phenotypes. Oncotarget 7, 20486–20495. https://doi.org/10.18632/oncotarget.7974 (2016).
Righi, V. et al. A metabolomic data fusion approach to support gliomas grading. NMR Biomed. 33, e4234. https://doi.org/10.1002/nbm.4234 (2020).
Bender, L. et al. Metabolomic profile of aggressive meningiomas by using high-resolution magic angle spinning nuclear magnetic resonance. J. Proteome Res. 19, 292–299. https://doi.org/10.1021/acs.jproteome.9b00521 (2020).
Goryńska, P. Z. et al. Metabolomic phenotyping of gliomas: What can we get with simplified protocol for intact tissue analysis?. Cancers Basel 14, 321. https://doi.org/10.3390/cancers14020312 (2022).
Penney, K. L. et al. Metabolomics of prostate cancer gleason score in tumor tissue and serum. Mol. Cancer Res. 19, 475–484. https://doi.org/10.1158/1541-7786.MCR-20-0548 (2021).
Mendez, K. M., Reinke, S. N. & Broadhurst, D. I. A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification. Metabolomics 15, 150. https://doi.org/10.1007/s11306-019-1612-4 (2019).
Francesco, M. et al. Some nonlinear challenges in biology. Nonlinearity 21, 131–147. https://doi.org/10.1088/0951-7715/21/8/t03 (2008).
Alakwaa, F. M., Chaudhary, K. & Garmire, L. X. Deep learning accurately predicts estrogen receptor status in breast cancer metabolomics data. J. Proteome Res. 17, 337–347. https://doi.org/10.1021/acs.jproteome.7b00595 (2018).
Cheng, S. C. et al. Metabolomic biomarkers in cervicovaginal fluid for detecting endometrial cancer through nuclear magnetic resonance spectroscopy. Metabolomics 15, 146. https://doi.org/10.1007/s11306-019-1609-z (2019).
Bifarin, O. O. et al. Machine learning-enabled renal cell carcinoma status prediction using multiplatform urine-based metabolomics. J. Proteome Res. 20, 3629–3641. https://doi.org/10.1021/acs.jproteome.1c00213 (2021).
Kouznetsova, V. L., Li, J., Romm, E. & Tsigelny, I. F. Finding distinctions between oral cancer and periodontitis using saliva metabolites and machine learning. Oral Dis. 27, 484–493. https://doi.org/10.1111/odi.13591 (2021).
Hershberger, C. E. et al. Salivary metabolites are promising non-invasive biomarkers of hepatocellular carcinoma and chronic liver disease. Liver Cancer Int. 2, 33–44. https://doi.org/10.1002/lci2.25 (2021).
Czajkowski, M., Jurczuk, K. & Kretowski, M. Accelerated evolutionary induction of heterogeneous decision trees for gene expression-based classification. In Proceedings of the Genetic and Evolutionary Computation Conference (Association for Computing Machinery, Lille, France, 2021). https://doi.org/10.1145/3449639.3459376.
Barros, R. C., Basgalupp, M. P., Freitas, A. A. & de-Carvalho, A. C. P. L. F. Evolutionary design of decision-tree algorithms tailored to microarray gene expression data sets. IEEE Trans. Evol. Comput. 18, 873–892. https://doi.org/10.1109/TEVC.2013.2291813 (2014).
Zhang, Y. et al. Distinguishing rectal cancer from colon cancer based on the support vector machine method and rna-sequencing data. Curr. Med. Sci. 41, 368–374. https://doi.org/10.1007/s11596-021-2356-8 (2021).
Lin, D. et al. Trends in intracranial glioma incidence and mortality in the United States, 1975–2018. Front. Oncol. 11, 748061. https://doi.org/10.3389/fonc.2021.748061 (2021).
Pienkowski, T., Kowalczyk, T., Kretowski, A. & Ciborowski, M. A review of gliomas-related proteins. Characteristics of potential biomarkers. Am. J. Cancer Res. 11, 3425–3444 (2021).
Rogachev, A. D. et al. Correlation of metabolic profiles of plasma and cerebrospinal fluid of high-grade glioma patients. Metabolites 11, 133. https://doi.org/10.3390/metabo11030133 (2021).
Pandey, R., Caflisch, L., Lodi, A., Brenner, A. J. & Tiziani, S. Metabolomic signature of brain cancer. Mol. Carcinog. 56, 2355–2371. https://doi.org/10.1002/mc.22694 (2017).
Rappoport, N. & Shamir, R. Multi-omic and multi-view clustering algorithms: Review and cancer benchmark. Nucleic Acids Res. 46, 10546–10562. https://doi.org/10.1093/nar/gky889 (2018).
Chen, Y., Li, E. M. & Xu, L. Y. Guide to metabolomics analysis: A bioinformatics workflow. Metabolites 12, 357. https://doi.org/10.3390/metabo12040357 (2022).
Xiao, Y. et al. Comprehensive metabolomics expands precision medicine for triple-negative breast cancer. Cell Res. 32, 477–490. https://doi.org/10.1038/s41422-022-00614-0 (2022).
Gal, J. et al. Comparison of unsupervised machine-learning methods to identify metabolomic signatures in patients with localized breast cancer. Comput. Struct. Biotechnol. J 18, 1509–1524. https://doi.org/10.1016/j.csbj.2020.05.021 (2020).
Li, N. et al. Combination of plasma-based metabolomics and machine learning algorithm provides a novel diagnostic strategy for malignant mesothelioma. Diagn. Basel 11, 1281. https://doi.org/10.3390/diagnostics11071281 (2021).
Huang, L. et al. Machine learning of serum metabolic patterns encodes early-stage lung adenocarcinoma. Nat. Commun. 11, 3556. https://doi.org/10.1038/s41467-020-17347-6 (2020).
Prade, V. M. et al. The synergism of spatial metabolomics and morphometry improves machine learning-based renal tumour subtype classification. Clin. Transl. Med. 12, e666. https://doi.org/10.1002/ctm2.666 (2022).
Gupta, A. et al. A non-invasive method for concurrent detection of early-stage women-specific cancers. Sci. Rep. 12, 2301. https://doi.org/10.1038/s41598-022-06274-9 (2022).
Wang, W., He, Z., Kong, Y., Liu, Z. & Gong, L. GC-MS-based metabolomics reveals new biomarkers to assist the differentiation of prostate cancer and benign prostatic hyperplasia. Clin. Chim. Acta 519, 10–17. https://doi.org/10.1016/j.cca.2021.03.021 (2021).
Adilkhanova, I., Ngarambe, J. & Yun, G. Y. Recent advances in black box and white-box models for urban heat island prediction: Implications of fusing the two methods. Renew. Sustain. Energy Rev. 165, 112520. https://doi.org/10.1016/j.rser.2022.112520 (2022).
Pontes, T. A., Barbosa, A. D., Silva, R. D., Melo-Junior, M. R. & Silva, R. O. Osteopenia-osteoporosis discrimination in postmenopausal women by 1H NMR-based metabonomics. PLoS ONE 14, e0217348. https://doi.org/10.1371/journal.pone.0217348 (2019).
Niu, B. et al. Glioma stages prediction based on machine learning algorithm combined with protein-protein interaction networks. Genomics 112, 837–847. https://doi.org/10.1016/j.ygeno.2019.05.024 (2020).
Shi, Y. et al. Integrative analysis of metabolomic and transcriptomic data reveals metabolic alterations in glioma patients. J. Proteome Res. 20, 2206–2215. https://doi.org/10.1021/acs.jproteome.0c00697 (2021).
Bobeff, E. J. et al. Plasma amino acids indicate glioblastoma with ATRX loss. Amino Acids 53, 119–132. https://doi.org/10.1007/s00726-020-02931-3 (2021).
Opstad, K. S., Bell, B. A., Griffiths, J. R. & Howe, F. A. Taurine: A potential marker of apoptosis in gliomas. Br. J. Cancer 100, 789–794. https://doi.org/10.1038/sj.bjc.6604933 (2009).
Tripathi, P. et al. Delineating metabolic signatures of head and neck squamous cell carcinoma: Phospholipase A2, a potential therapeutic target. Int. J. Biochem. Cell Biol. 44, 1852–1861. https://doi.org/10.1016/j.biocel.2012.06.025 (2012).
Lima, L., Obregon, F., Cubillos, S., Fazzino, F. & Jaimes, I. Taurine as a micronutrient in development and regeneration of the central nervous system. Nutr. Neurosci. 4, 439–443. https://doi.org/10.1080/1028415x.2001.11747379 (2001).
Thomas, T. M. et al. Elevated asparagine biosynthesis drives brain tumor stem cell metabolic plasticity and resistance to oxidative stress. Mol. Cancer Res. 19, 1375–1388. https://doi.org/10.1158/1541-7786.MCR-20-0086 (2021).
Tallima, H., Azzazy, H. M. E. & El Ridi, R. Cell surface sphingomyelin: Key role in cancer initiation, progression, and immune evasion. Lipids Health Dis. 20, 150. https://doi.org/10.1186/s12944-021-01581-y (2021).
Tea, M. N., Poonnoose, S. I. & Pitson, S. M. Targeting the sphingolipid system as a therapeutic direction for glioblastoma. Cancers (Basel) 12, 111. https://doi.org/10.3390/cancers12010111 (2020).
Bi, J. et al. Targeting glioblastoma signaling and metabolism with a re-purposed brain-penetrant drug. Cell Rep. 37, 109957. https://doi.org/10.1016/j.celrep.2021.109957 (2021).
Du, L. et al. Correction: Both IDO1 and TDO contribute to the malignancy of gliomas via the Kyn–AhR–AQP4 signaling pathway. Signal Transduct. Target. Ther. 6, 385. https://doi.org/10.1038/s41392-021-00808-9 (2021).
Samanic, C. M. et al. A prospective study of pre-diagnostic circulating tryptophan and kynurenine, and the kynurenine/tryptophan ratio and risk of glioma. Cancer Epidemiol. 76, 102075. https://doi.org/10.1016/j.canep.2021.102075 (2022).
Hulin, J. A. et al. Inhibition of dimethylarginine dimethylaminohydrolase (DDAH) enzymes as an emerging therapeutic strategy to target angiogenesis and vasculogenic mimicry in cancer. Front. Oncol. 9, 1455. https://doi.org/10.3389/fonc.2019.01455 (2019).
Wang, Z. et al. Methionine is a metabolic dependency of tumor-initiating cells. Nat. Med. 25, 825–837. https://doi.org/10.1038/s41591-019-0423-5 (2019).
Wyss, M. & Kaddurah-Daouk, R. Creatine and creatinine metabolism. Physiol. Rev. 80, 1107–1213. https://doi.org/10.1152/physrev.2000.80.3.1107 (2000).
Kinoshita, Y. & Yokota, A. Absolute concentrations of metabolites in human brain tumors using in vitro proton magnetic resonance spectroscopy. NMR Biomed. 10, 2–12. https://doi.org/10.1002/(sici)1099-1492(199701)10:1%3c2::aid-nbm442%3e3.0.co;2-n (1997).
das-Neves, W., Alves, C. R. R., de-Souza-Borges, A. P. & de-Castro, G. Serum creatinine as a potential biomarker of skeletal muscle atrophy in non-small cell lung cancer patients. Front. Physiol. 12, 625417. https://doi.org/10.3389/fphys.2021.625417 (2021).
Li, W. et al. Glycerophosphatidylcholine PC(36:1) absence and 3′-phosphoadenylate (pAp) accumulation are hallmarks of the human glioma metabolome. Sci. Rep. 8, 14783. https://doi.org/10.1038/s41598-018-32847-8 (2018).
Yu, D. et al. Metabolic alterations related to glioma grading based on metabolomics and lipidomics analyses. Metabolites 10, 478. https://doi.org/10.3390/metabo10120478 (2020).
Monleón, D. et al. Metabolic aggressiveness in benign meningiomas with chromosomal instabilities. Cancer Res. 70, 8426–8434. https://doi.org/10.1158/0008-5472.CAN-10-1498 (2010).
Baranovičová, E. et al. Metabolomic profiling of blood plasma in patients with primary brain tumours: Basal plasma metabolites correlated with tumour grade and plasma biomarker analysis predicts feasibility of the successful statistical discrimination from healthy subjects—a p. IUBMB Life 71, 1994–2002. https://doi.org/10.1002/iub.2149 (2019).
Ijare, O. B. et al. Glutamine anaplerosis is required for amino acid biosynthesis in human meningiomas. Neuro Oncol. 24, 556–568. https://doi.org/10.1093/neuonc/noab219 (2022).
Yamashita, D. et al. Targeting glioma-initiating cells via the tyrosine metabolic pathway. J. Neurosurg. 134, 721–732. https://doi.org/10.3171/2019.11.JNS192028 (2020).
Firdous, S. et al. Dysregulated alanine as a potential predictive marker of glioma-an insight from untargeted HRMAS-NMR and machine learning data. Metabolites 11, 507. https://doi.org/10.3390/metabo11080507 (2021).
Lee, J. E. et al. Metabolic profiling of human gliomas assessed with NMR. J. Clin. Neurosci. 68, 275–280. https://doi.org/10.1016/j.jocn.2019.07.078 (2019).
Osborn, A. G., Louis, D. N., Poussaint, T. Y., Linscott, L. L. & Salzman, K. L. The 2021 World Health Organization classification of tumors of the central nervous system: What neuroradiologists need to know. AJNR Am. J. Neuroradiol. 43, 928–937. https://doi.org/10.3174/ajnr.A7462 (2022).
Louis, D. N. et al. The 2016 World Health Organization classification of tumors of the central nervous system: A summary. Acta Neuropathol. 131, 803–820. https://doi.org/10.1007/s00401-016-1545-1 (2016).
Sawicka-Smiarowska, E. et al. Gut microbiome in chronic coronary syndrome patients. J. Clin. Med. 10, 5074. https://doi.org/10.3390/jcm10215074 (2021).
Reska, D. et al. Integration of solutions and services for multi-omics data analysis towards personalized medicine. Biocybern. Biomed. Eng. 41, 1646–1663. https://doi.org/10.1016/j.bbe.2021.10.005 (2021).
Kotu, V. & Deshpande, B. Predictive Analytics and Data Mining: Concepts and Practice with Rapidminer (Morgan Kaufmann, 2014).

Acknowledgements

The research was supported by the funds of the Ministry of Education and Science within the project “Excellence Initiative—Research University”, Subsidy of the Medical University of Bialystok (SUB/1/DN/22/007/4406), and the Polish National Science Centre allocated on the basis of decision 2019/33/B/ST6/02386. The manuscript contains results obtained from samples acquired during the project VAMP financed by The National Centre for Research and Development POIR.04.01.04-00-0052/18. Center for artificial intelligence at the Medical University of Bialystok. Project funded by the Ministry of Health of the Republic of Poland.

Author information

Authors and Affiliations

Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
Adrian Godlewski, Patrycja Mojsak, Tomasz Pienkowski, Wioleta Gosk, Adam Kretowski & Michal Ciborowski
Faculty of Computer Science, Bialystok University of Technology, Białystok, Poland
Marcin Czajkowski & Marek Kretowski
Department of Neurosurgery, Medical University of Bialystok, Białystok, Poland
Tomasz Lyson & Zenon Mariak
Department of Medical Pathomorphology, Medical University of Bialystok, Białystok, Poland
Joanna Reszec
Department of Population Medicine and Lifestyle Diseases Prevention, Medical University of Bialystok, Białystok, Poland
Marcin Kondraciuk & Karol Kaminski
Department of Regenerative Medicine and Immune Regulation, Medical University of Bialystok, Białystok, Poland
Marcin Moniuszko
Department of Allergology and Internal Medicine, Medical University of Bialystok, Białystok, Poland
Marcin Moniuszko
Department of Endocrinology, Diabetology and Internal Medicine, Medical University of Bialystok, Białystok, Poland
Adam Kretowski

Contributions

A.G., P.M., T.L., Z.M., and M.Ci. conceived the project. T.L., Z.M., J.R., and K.K. provided clinical samples. A.G. and P.M. designed the experiments. A.G., W.G., and M.Ko. acquired data. M.Cz. performed all bioinformatics analyses and developed the hybrid algorithm. M.Cz., M.M., and A.G. acquired funding. A.G. and P.M. wrote the manuscript. T.P., T.L., Z.M., J.R., M.Ko., K.K., M.Kr., A.K., M.M., and M.Ci. assisted with editorial correction of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Michal Ciborowski.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Table S1.

Supplementary Table S2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Godlewski, A., Czajkowski, M., Mojsak, P. et al. A comparison of different machine-learning techniques for the selection of a panel of metabolites allowing early detection of brain tumors. Sci Rep 13, 11044 (2023). https://doi.org/10.1038/s41598-023-38243-1

Received: 23 February 2023
Accepted: 05 July 2023
Published: 08 July 2023
DOI: https://doi.org/10.1038/s41598-023-38243-1
Springer Nature Limited
