Background

Stem cells are considered the ‘holy grail’ for therapeutics due to their renewal and regenerative capabilities [1]. Mesenchymal Stromal/Stem Cells (MSCs) fulfill the key requirements of therapeutic candidates, namely the ability to differentiate multi-directionally, immunomodulatory activity, anti-inflammatory potential and anti-microbial capability. Their translational potential is also aided by the fact that the protocols required to isolate and expand them are simpler and faster than those for other stem cells [2]. These biological capabilities, together with accessible experimental procedures, have made the treatment of diseases using MSCs an attractive proposition.

As a consequence, new information is constantly being generated from many disease-specific studies. In spite of these endeavors, the efficacy of MSCs has not been consistent across these translational studies [3]. This heterogeneity could arise because a multitude of parameters must be decided in order to conduct an MSC translational experiment. Some of these, like the animal model, dosage and time of MSCs infusion, are determined by the researcher, while others, like physiological variability, immune environment and disease condition, are intrinsic to the host and donor biology. The challenges faced by a researcher in accurately choosing these extrinsic experimental factors and assessing the intrinsic host factors could ultimately dictate the outcome of these translational studies.

Sepsis is one such condition where the characteristics of MSCs, such as their ability to differentiate multi-directionally, modulate the immune system, control inflammation and counter microbes, are entirely pertinent. Sepsis entails an abnormal host response to a microbial infection and results in a highly dysregulated immune response, hyperinflammation and microbial invasion that could lead to multiorgan failure [4]. As specific and effective therapies against sepsis are still lacking [5], it is an appropriate model for MSCs therapy. Although MSCs have been shown to reduce mortality in septic animal models [6], ambiguity exists due to differences in MSCs treatment procedures such as the source of MSCs, the dosage and timing of MSCs infusions, and the host microenvironment [7].

With the goal of suggesting optimization criteria for therapeutic interventions in clinical settings, we started by scanning the literature for terms related to the role of MSCs in preclinical models and clinical trials. We found considerable variability in the extrinsic experimental factors such as the dose utilized, the timing of infusion post sepsis induction, the weight of the model and the transplant type. Furthermore, the intrinsic host physiology governed outcomes in terms of cytokine production, factors for organ failure, etc., and this eventually led to survival rate variability across studies. These variable parameters and experimental outcomes make it very difficult to decide on the exact methodology needed to achieve consistent therapeutic outcomes.

With the emergence of advanced machine learning techniques, strategies have now been developed to identify and predict the factors that govern the outcome of therapeutic interventions [8]. In order to suggest the rules and associated factors that may help the researchers overcome the challenges of using MSCs as therapeutic agents, we applied machine learning techniques to extract knowledge rules from the published data pertaining to the use of MSCs in sepsis conditions. The graphical representation of the study is depicted in Fig. 1.

Fig. 1

Schematic representation of the study. (Created with BioRender.com)

After pre-processing and curation of these data sets, we implemented the machine learning algorithms to recognize the trajectory of outcomes post MSC infusion. We aimed to identify the extrinsic experimental factors as well as the internal host factors, and their associated rules, that influence the efficacy of MSCs transplantation and thereby dictate the outcome (survival rate in the studies). The generated graph models indicated that the survival rates post-infusion of MSCs are influenced by factors like the animal model, the source, dosage and time of infusion of MSCs, and the basal levels of inflammatory cytokines like IL-6 and TNF-α. This information is particularly useful for predicting the clinical outcome of MSCs application in a specific setting and, more importantly, for designing targeted treatments.

Methods

All preprocessing, filtering and machine learning analyses were conducted using the Waikato Environment for Knowledge Analysis (WEKA) 3.8.6 machine learning toolkit along with R scripts.

Data acquisition

A systematic search was carried out to collect available data on the therapeutic effects of mesenchymal stem cells (MSCs) for sepsis in preclinical models. Public databases such as PubMed, Google Scholar and Litmaps were searched until May 2023. The following terms were used as key words or free text words: “sepsis” and “therapeutic” and “treatment” and “mesenchymal stem cells”. In total, 67 manuscripts were chosen (Supplementary Table 1, Additional file 1).

After an extensive search, a number of common parameters were identified for analysis through machine learning techniques. These included: source of MSCs, animal model, weight, dose of MSCs, time of MSCs administration, survival rate, serum levels of the pro-inflammatory cytokines TNF-α and IL-6, and the liver enzymes aspartate aminotransferase (AST) and alanine transaminase (ALT).

Source of MSCs

The literature gathered included sources like Adipose tissue, Bone Marrow and Umbilical cord. Additionally, compatibility was either allogeneic or xenogeneic depending upon source and host of MSCs.

Animal model

Mice and rats were the most commonly used models for studying sepsis at the preclinical level. Two studies included pigs as well.

Dose

To treat sepsis, MSCs were administered intravenously in most of the studies. Dose is reported throughout the study as the number of cells (in 10^6) per animal.

Time of administration

The time of MSCs infusion after establishing the septic model differed between studies, ranging from 0 to 48 h. In the majority of studies, however, MSCs were injected within 6 h of generating the model.

Survival rate was compared between septic models without MSCs infusion and septic models infused with MSCs at 48 h (Supplementary Fig. 1, Additional file 1). Levels of TNF-α, IL-6, AST and ALT before and after MSCs infusion were noted. Cytokine levels were reported as pg/mL.

Data preprocessing and handling incomplete data

Missing data, prevalent in many clinical and experimental studies, leads to inadequacies while building predictive models [9, 10]. These inconsistencies arise due to experimental design or data acquisition problems. The possibility of missing data is higher in meta-analysis studies, such as ours, where results from multiple studies are pooled together. As a result of experimental design and methodologies followed by different studies our pooled database had missing data, ranging from 15 to 35% in 4 attributes, namely: Survival Rate, IL-6, TNF-α, and Dose.

Multiple studies have reported varied methods for dealing with missing data. These range from extreme measures like removing the records with missing values, to imputing the missing attribute values with the mean or median value. Some methods also “fill in” missing data by linear interpolation or extrapolation. As missing data should be processed before conducting any analysis [11], we imputed the missing values of the attribute Survival_Rate by the mean, and those of the immune attributes IL-6 and TNF-α by the median. Missing values in the attribute Dose were replaced by the most common dose value corresponding to the animal model used and its respective body_weight.
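The imputation scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the WEKA/R procedure used in the study: the record layout, the key names (`Survival_Rate`, `IL_6`, `TNF_alpha`, `Dose`, `Animal_Model`, `Body_Weight`), the toy values, and the choice to group doses by exact (animal model, body weight) pairs are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, median, mode

def impute(records):
    """Return new records with None values filled as described in the text."""
    def observed(attr):
        return [r[attr] for r in records if r[attr] is not None]

    survival_mean = mean(observed("Survival_Rate"))
    cytokine_median = {a: median(observed(a)) for a in ("IL_6", "TNF_alpha")}

    # Collect observed doses per (animal model, body weight) group.
    doses_by_group = defaultdict(list)
    for r in records:
        if r["Dose"] is not None:
            doses_by_group[(r["Animal_Model"], r["Body_Weight"])].append(r["Dose"])

    filled = []
    for r in records:
        new = dict(r)  # copy, so the input records are left untouched
        if new["Survival_Rate"] is None:
            new["Survival_Rate"] = survival_mean
        for a in ("IL_6", "TNF_alpha"):
            if new[a] is None:
                new[a] = cytokine_median[a]
        group = doses_by_group.get((new["Animal_Model"], new["Body_Weight"]))
        if new["Dose"] is None and group:
            new["Dose"] = mode(group)  # most common dose in the same group
        filled.append(new)
    return filled

# Toy records with invented values, for illustration only.
toy = [
    {"Animal_Model": "mice", "Body_Weight": 25, "Dose": 1.0,
     "Survival_Rate": 80.0, "IL_6": 400.0, "TNF_alpha": 100.0},
    {"Animal_Model": "mice", "Body_Weight": 25, "Dose": 1.0,
     "Survival_Rate": 60.0, "IL_6": None, "TNF_alpha": 120.0},
    {"Animal_Model": "mice", "Body_Weight": 25, "Dose": 0.5,
     "Survival_Rate": None, "IL_6": 900.0, "TNF_alpha": None},
    {"Animal_Model": "mice", "Body_Weight": 25, "Dose": None,
     "Survival_Rate": 70.0, "IL_6": 500.0, "TNF_alpha": 140.0},
]
toy_filled = impute(toy)
```

A record whose group has no observed dose is left with a missing dose in this sketch; how such cases were handled in the study is not stated.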

To maintain consistency in the nomenclature, we renamed the values of the attribute “Animal model” from “mouse” to “mice”. Additionally, the 2 values “Wharton’s jelly” and “Amniotic Fluid” were replaced by “Umbilical cord” to reduce the number of distinct nominal values in the attribute “Source_of_MSCs”.

A single outlier data record where the “Source_of_MSCs” was “Menstrual fluid” was removed.

Because classification models require the predictive class attribute to be nominal, we discretized the class attribute “Survival_Rate” using equal-width binning. This attribute was discretized into 3 bins: “67–100%”, “34–66%” and “0–33%”. Biologically these correspond to “high”, “median” and “low” survival rates.
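As a rough illustration of equal-width binning, the sketch below maps a survival percentage to one of three bins. It assumes that the range is fixed at 0–100% and split into three equal intervals; the function name and the fixed range are assumptions made for this example, and the actual discretization settings used in WEKA are not described in the text.

```python
BIN_LABELS = ["0–33%", "34–66%", "67–100%"]

def equal_width_bin(value, low=0.0, high=100.0, n_bins=3):
    """Return the index of the equal-width bin that contains value."""
    if not low <= value <= high:
        raise ValueError("value lies outside the binning range")
    width = (high - low) / n_bins
    # The upper edge of the range belongs to the last bin.
    return min(int((value - low) / width), n_bins - 1)

def survival_label(value):
    """Map a survival percentage to its bin label."""
    return BIN_LABELS[equal_width_bin(value)]
```

With these assumptions, a survival rate of 50% falls into the “34–66%” bin and a survival rate of 80% into the “67–100%” bin.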

Selection of input attributes

Selecting the most appropriate input properties for classifying the “Survival_Rate” is challenging for our meta-analysis, as the available attributes not only vary across studies but also show large variation in their values. To identify the researcher-declared extrinsic experimental factors as well as the internal host factors and their associated rules, we decided to generate two types of models, an experimental factors model and an immunological factors model, by manually selecting the input properties based on these 2 criteria. The experimental properties were: Animal_Model, Source of MSCs, Dose, Time_MSC_Infusion. Variation in the inflammatory environment in the tissues can regulate the immunoregulatory properties of MSC [12, 13].

Therefore, we analysed the correlation between the basal levels of the inflammatory cytokines IL-6 and TNF-α before MSC infusion and the survival rate of septic animal models after MSC infusion. Two immunological models were built, using either basal/control TNF-α levels (Model 1) or basal/control IL-6 levels (Model 2). As these also depend on the experimental properties, each was taken along with the experimental properties for further analysis. Although AST and ALT levels were also available, we did not employ these for machine learning analysis, as they had a large number of missing values (~ 50%).

We employed the “AttributeSelectedClassifier”, with tenfold cross-validation, to select the attributes that had the greatest influence in predicting the “Survival_Rate”. The score “Merit of best subset found” was used to analyze the importance of select pre and post IL-6/TNF-α attributes towards the chosen classification algorithm.

Machine learning methods

We employed a variety of classification approaches to assess the impact of the selected attributes on predicting the efficacy of MSCs therapy. The WEKA 3.8.6 machine learning toolkit along with R scripts was used for implementing and testing the classification models. We initiated the analysis by using decision tree models like J48 and Random Forest (RF) [14], which is an ensemble of decision trees. After obtaining encouraging results we tested the performance of Naïve Bayes models, Support Vector Machines (SVM, using a radial basis function) and the Multilayer Perceptron (neural network). Ensemble algorithms like AdaBoostM1 (boosting), Bagging and Stacking were also tested. Logistic regression was also employed, and the odds ratio of each attribute was used to assess its contribution towards classifying the survival rates.
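For readers less familiar with the odds ratio mentioned above: in logistic regression it is conventionally obtained by exponentiating a fitted coefficient. The minimal Python sketch below shows only this conversion; the coefficient is a hypothetical value chosen for illustration and is not taken from the models in this study.

```python
import math

def odds_ratio(coefficient: float) -> float:
    """Convert a logistic-regression coefficient (log-odds) to an odds ratio."""
    return math.exp(coefficient)

# Hypothetical coefficient, for illustration only (not a value from this study).
example_coefficient = math.log(2.0)
example_or = odds_ratio(example_coefficient)  # odds roughly double per unit increase
```

An odds ratio above 1 indicates that the odds of the class increase with the attribute, a value below 1 that they decrease, and a value of 1 that the attribute has no effect on the odds.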

Decision trees

Although best known for their prediction capability, the true utility of machine learning algorithms on biological data lies in knowledge acquisition that unravels patterns amongst various biological attributes. Decision trees are a set of classification algorithms that aid in generating novel knowledge by applying discriminating criteria to a graphical representation of biological attributes. One can then use such a graph of conditions to infer possible consequences of a treatment. At the very top of a decision tree is a root node that represents the most important condition for discriminating classes. Lower internal nodes represent additional conditions for class discrimination, whereas the leaf nodes represent the final classification of a biological condition under scrutiny. By following the path from the root node to the leaf node [15], one can learn certain “rules” for the classification of records in a dataset. Therefore, in order to discover the rules (and associated attributes) that influence survival rate, we gave the experimental attributes and immunological attributes to the J48 decision tree algorithm, WEKA’s implementation of the C4.5 algorithm [16].
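The idea of reading rules off a tree by following each root-to-leaf path can be sketched as follows. The nested-dictionary representation, the function name and the toy tree (including its leaf labels) are assumptions made purely for illustration; this is neither the J48 output format nor one of the trees fitted in this study.

```python
def extract_rules(node, conditions=()):
    """Yield (conditions, label) for every root-to-leaf path of a tree."""
    if "label" in node:  # leaf node: the path so far forms one rule
        yield list(conditions), node["label"]
        return
    for test, child in node["branches"].items():
        condition = f"{node['attribute']} {test}"
        yield from extract_rules(child, conditions + (condition,))

# Toy tree with invented leaves, for illustration only.
toy_tree = {
    "attribute": "Animal_Model",
    "branches": {
        "= rat": {"label": "high"},
        "= mice": {
            "attribute": "Source_of_MSCs",
            "branches": {
                "= Adipose tissue": {"label": "high"},
                "= Bone Marrow": {"label": "intermediate"},
            },
        },
    },
}

toy_rules = list(extract_rules(toy_tree))
```

Each element of `toy_rules` pairs the list of conditions met along one path with the class at its leaf, which is the sense in which a decision tree yields human-readable rules.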

Model performance and comparison

Using tenfold cross-validation, we took the percentage of correctly classified instances and the F-score as measures of model accuracy. To compare the different algorithms, we employed the “PairedCorrectedTTester” to find which of the algorithms performed significantly better or worse than the others. As measures of individual classifier performance, we used Percent_correct (the percentage of instances that are correctly classified by a machine learning algorithm) and F-measure (a measure of a classifier’s accuracy that takes into account both precision and recall, where a higher value indicates better performance).
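The two performance measures, and the kind of statistic behind a corrected paired t-test, can be sketched in plain Python. This is an illustrative sketch and not WEKA code: the function names and toy labels are hypothetical, the F-measure is computed for a single class, and the t statistic follows the corrected resampled t-test of Nadeau and Bengio, which to our understanding is the correction that WEKA's tester applies.

```python
import math
from statistics import mean, variance

def percent_correct(actual, predicted):
    """Percentage of instances whose predicted class equals the actual class."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * hits / len(actual)

def f_measure(actual, predicted, positive):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

def corrected_paired_t(differences, test_train_ratio):
    """Corrected resampled t statistic for paired per-fold score differences."""
    k = len(differences)
    spread = variance(differences)  # sample variance (k - 1 in the denominator)
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(differences) / math.sqrt((1 / k + test_train_ratio) * spread)

# Toy labels, for illustration only.
toy_actual = ["high", "high", "low", "low", "high"]
toy_predicted = ["high", "low", "low", "high", "high"]
```

For tenfold cross-validation, `test_train_ratio` would be 1/9, since each fold tests on one tenth of the data and trains on the remaining nine tenths; the resulting statistic is compared against a t distribution with k - 1 degrees of freedom.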

Results

Curated database of mesenchymal stromal cells in septic preclinical models

Data preprocessing by the removal of outlier records, renaming of nominal values and imputation of missing values resulted in the selection of 78 records (from 67 published reports) where the septic animal models were injected with MSCs. Post attribute selection, these 78 records had 8 attributes that were chosen as inputs for further analysis. Dimensionality reduction with the attribute selection classifier further revealed that three attributes, Animal_Model, Source_of_MSCs and Dose_per_Animal, were common to the experimental factors model and both the immunological factors models. For the experimental factors model, one additional attribute, Time, was selected along with the three above (Merit score ranged from 0.83 to 0.89). For immunological model 1, the additional attribute selected was the TNF-α level before infusion (Control_TNFα; Merit score of 0.88). Immunological model 2 had the basal level of IL-6 (Control IL_6, with a merit score of 0.92) along with the three common ones above. Thereafter, 4 attributes, for each of the respective models, were used as inputs for classification algorithms. It is noteworthy that these attributes influenced the Survival_Rate only in the cases where the Animal_Model was mice (Figs. 2, 3, 4). In the models generated, the path from the root node to the outcome leaf node through the intermediate levels suggested the following rules that influenced the survival rates (Table 1).

Fig. 2

Experimental Factors Model: Decision Tree depicting the dependence of survival rate on the animal model, source, dose and time of infusion of MSCs. The numbers in the braces represent: total number of records/number of misclassified records

Fig. 3

Immunological Model 1: Decision tree illustrates the dependence of survival rate on basal levels of TNF-α (Control_TNF-α) in addition to the animal model, source, dose and time of infusion of MSCs. The numbers in the braces represent: total number of records/number of misclassified records. Note: TNF-A stands for TNF- alpha

Fig. 4

Immunological Model 2: Decision tree illustrates the dependence of survival rate on basal levels of IL-6 (Control_IL-6) in addition to the animal model, source, dose and time of infusion of MSCs. The numbers in the braces represent: total number of records/number of misclassified records

Table 1 Set of classification rules containing 21 rules for MSC transplantation experimental design

Classification: decision trees

Extrinsic experimental rules and attributes

Post tenfold cross validation, we analyzed the best representative decision tree for survival rate. As can be observed in Fig. 2, the root node of the decision tree is the Animal_model (mice, rats or pigs), which influences survival rate. This root node therefore predicts that, for rats and pigs, the type of animal alone dictates the outcome of the MSCs transplant: the survival rate is high (above 66%) for both rat (21 records) and pig (2 records) animal models. However, if the animal model is “mice”, there were additional rules that influenced the success of the MSCs infusion. At the second level, the attribute represents the source of MSCs used in the mice model: bone marrow, adipose tissue or umbilical cord. Interestingly, adipose tissue-derived MSCs (AD-MSCs) resulted in a high survival rate (above 66%: 13 out of 16 correctly classified records). In the case of umbilical cord-derived MSCs (UC-MSCs) and bone marrow-derived MSCs (BM-MSCs), there were additional rules that influenced the success of the MSCs infusion.

The next level of the decision tree identifies the dose of MSCs administered to the mice as the governing factor for survival rate. Instances where the MSCs dose was less than 0.8 × 10^6 cells reported a high survival rate (above 66%: 10 out of 12 correctly classified records). However, when the MSCs dose exceeded 0.8 × 10^6 cells, the time of infusion of the MSCs influenced the mice survival rate. In the case of BM-MSCs infusion, when the time of infusion was less than 0 h, the survival rate was above 66% (3 records), whereas the survival rate mostly fell into the 34–66% bin (7 out of 11 correctly classified records) when the time of infusion was between 0 and 6 h. On the other hand, in the case of UC-MSCs infusion, a time of infusion between 2 and 4 h resulted in a survival rate of 67–100% (4 out of 5 correctly classified records).

These findings highlight the importance of considering the type of animal model, source of MSCs, dosage and time of infusion when designing an experiment for treating animal models of sepsis. Our data driven decision tree models suggest that researchers may increase the likelihood of achieving successful outcomes in their experiments if they give importance to these attributes.
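The rules described in this subsection can be read as a nested set of conditions. The Python sketch below is one possible encoding, written only from the prose above and not from the fitted tree itself. Several details are assumptions: the function and argument names are hypothetical, the dose threshold is treated as inclusive, "less than 0 h" is read as infusion at or before sepsis induction, and `None` is returned wherever the text does not state an outcome.

```python
HIGH, MID = "67–100%", "34–66%"

def predict_survival_bin(animal, source, dose_million, time_h):
    """Return the survival-rate bin implied by the described rules, or None."""
    if animal in ("rat", "pig"):
        return HIGH  # root node: rats and pigs are classified as high survival
    if animal != "mice":
        return None
    if source == "Adipose tissue":
        return HIGH
    if source not in ("Bone Marrow", "Umbilical cord"):
        return None
    if dose_million <= 0.8:  # dose in 10^6 cells per animal (inclusivity assumed)
        return HIGH
    if source == "Bone Marrow":
        if time_h <= 0:  # assumed reading of "less than 0 h"
            return HIGH
        if time_h <= 6:
            return MID
        return None  # no outcome stated beyond 6 h
    if 2 <= time_h <= 4:  # umbilical cord branch
        return HIGH
    return None  # no outcome stated outside the 2-4 h window
```

The returned bin is the majority class of the corresponding leaf, so it should be read as the most frequent outcome among the pooled records rather than as a guaranteed result.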

Intrinsic immunological rules and attributes

The models based upon immunological attributes showed a lot of similarity to the model based upon experimental attributes. As in the previous model, the root node of the decision tree is based on the chosen animal model: mice, rats, or pigs. Remarkably, according to both the immunological models, all instances involving rats (21 records) and pigs (2 records) resulted in high survival rates (above 66%). However, in the case of mice models, additional factors influenced the success of MSCs infusion. In parallel with the above results, AD-MSCs resulted in a high survival rate (above 66%: 13 out of 16 correctly classified records) in the mice model. The subsequent branch of the decision tree in immunological model 1 (Fig. 3) reveals the importance of basal levels of TNF-α in mice, i.e. levels without MSCs infusion. Mice models with TNF-α levels above 110 pg/mL had a survival rate of more than 66% post BM-MSCs infusion, as opposed to a low survival rate when TNF-α levels were lower than 110 pg/mL.

Moreover, in immunological model 2 (Fig. 4), basal levels of IL-6 in mice, i.e. levels without MSCs infusion, were also observed to determine the success of the MSCs infusion in terms of survival rate. When 0.8 million or fewer BM-MSCs were infused per mouse, the survival rate was invariably higher than 67%. If the dose exceeded 0.8 million, the survival rate depended upon basal IL-6 levels: when serum levels were more than 380 pg/mL, the survival rate was more than 67% (7 out of 11 correctly classified records), whereas lower basal IL-6 levels correlated with a lower survival rate (2 correctly classified records). Likewise, in the case of UC-MSCs infusion, basal IL-6 levels of more than 900 pg/mL resulted in a higher survival rate (4 out of 5 correctly classified records), while the survival rate was between 34 and 66% when basal IL-6 levels were less than 900 pg/mL.

Just as in the experimental model, these findings emphasize the significance of the type of animal model, transplantation methods, MSCs source, dose and timing of MSCs infusion when designing an experimental study leading to MSCs therapy. Additionally, the basal levels of pro-inflammatory cytokines like IL-6 and TNF-α should be given careful consideration so that researchers can ensure a greater probability of attaining higher survival rates from their infusion experiments.

To summarize, the path from the root node to the outcome leaf node through the intermediate levels suggested the following rules that influenced the survival rates in the immunological model (ST-1): (i) Rules 1–3 are applicable to both conditions (Dose, IL-6 and TNF-α); (ii) Rules 4–11 are based on the dose of MSCs and the infusion time; (iii) Rules 12–16 are determined by the dose and IL-6 levels in the mice; (iv) Rules 17–21 are based on the dose of MSCs and the TNF-α levels.

Comparisons with other machine learning algorithms

We implemented classification models, using the same selected attributes as inputs, by employing a variety of classifiers. The tenfold cross validation revealed that all the classifiers, with the exception of Naïve Bayes and Stacking (~ 56% accuracy for both), had high accuracy, and most of these algorithms correctly classified the instances with > 70% accuracy (Table 2).

Table 2 The Machine learning classification methods and their performance accuracy as measured by the percentage of correctly classified records and the F-Measure that combines precision and recall.

To find out which of the algorithms were the best performers, we applied the “PairedCorrectedTTester” to assess the statistical significance of their performance. Both Percent_correct and F-measure revealed that the J48 decision tree, as well as other algorithms, had high accuracy, both in terms of correctly classified instances and in terms of F-measure (Table 2).

Discussion

The fact that MSCs act variably in different microenvironments compels us to standardize factors like their dosage, time of infusion and MSCs source to obtain clinically applicable outcomes. Clinical applications of MSCs can only be successful when we are aware of the parameters that influence the efficacy of the infused MSCs. Therefore, it is imperative that, during experimental design, emphasis be placed on those factors that enhance the efficacy of MSCs. In this study, we identify the attributes that may influence the outcome of MSCs infusion experiments by applying machine learning techniques. Although MSCs therapy for sepsis has been widely researched over the last decade, the parameters dictating the outcome of the treatment regimen remain unknown. Analysis of the available literature revealed several factors that are dependent upon the decisions taken by the researcher during experimental design. There are also several host-specific factors, like the immune environment, that directly influence the success rate of the MSCs treatment [17]. Our machine learning approach has not only highlighted several of these attributes, but has also generated rules informing us of the hierarchical interactions between different attributes, as well as the values of these factors that determine the outcome. Here we discuss some of the salient features of these factors.

It is noteworthy that the decision tree algorithms, in the case of mice, highlighted the three-way hierarchical interplay between source of MSCs, dosage and time of infusion. The rule generated by our analysis suggests that the survival rates are influenced by the source from which the MSCs are derived. The analysis revealed that in the studies in which the septic group of mice received AD-MSCs, the survival rate was greater than 66%. In contrast, when the tissue source was either bone marrow or umbilical cord, other factors like dose, time of infusion and basal levels of pro-inflammatory cytokines determined the survival rate.

Interestingly, an attribute that, according to the machine learning decision trees, does not play a role in MSC efficacy is transplant compatibility. According to the data, the majority of AD-MSCs were infused allogeneically, whereas BM-MSCs were infused both allogeneically and xenogeneically, and UC-MSCs infusion was mostly xenogeneic. This attribute, “Transplant type”, did not have sufficient support from the decision tree algorithm and is therefore not part of the generated rules. By analysing the effect of dosage on the survival rate of septic mice models, Li et al. demonstrated that mice administered a medium dose (~ 0.5 × 10^6 per mouse) displayed the highest survival rates [7], whereas low and high doses were detrimental to their survival. Adverse effects of high doses of ADSCs, injected intravitreally, were observed by Rasiah et al. when treating mild traumatic brain injury [18]. Another study by Chen et al. on ischemic stroke rat models concluded that a low dose of MSCs infusion displayed the most effective neurobehavioral recovery and infarction reduction [19]. Likewise, our analysis indicated an average dose of 0.8 million per mouse to be appropriate and responsible for a higher survival rate post sepsis induction.

The time of administration of MSCs to treat sepsis in animal models varied between studies and ranged from 0 to 48 h after the induction of sepsis. These studies have shown that this time range was the most appropriate for alleviating bacteremia by increasing the survival rate of septic models, decreasing levels of proinflammatory cytokines, and also reducing levels of liver enzymes [20,21,22,23,24,25]. There are also documented relationships between the time of MSCs administration and the immune response in terms of proinflammatory, anti-inflammatory and coagulation factors. A biphasic response, in an LPS (lipopolysaccharide)-induced sepsis model, has been observed depending upon the timing of MSCs infusion. The activation or suppression of innate immune pathways was shown to be dependent on early (2 h, activation) or late (4 h, suppression) LPS induction [26]. Reports on the timing of MSCs infusion in relation to the effectiveness of solid organ transplantation have been contradictory [27]. In other disease conditions like COVID-19, early MSCs infusion in critically ill patients reduced their mortality [28]. Our decision tree rules also revealed that the timing of MSCs infusion dictates the outcome, with MSCs infused in parallel to the onset of disease (for BM-MSCs) or in a window between 2 and 6 h of sepsis induction (for UC-MSCs) leading to a relatively higher survival rate. Importantly, these rules indicate the well-established immunomodulatory activity of MSCs, as the mortality rates associated with sepsis can be mitigated by early infusion of MSCs.

Among the literature analysed for mice models in this study, three manuscripts differed wherein the time of infusion of MSCs was greater than 6 h after sepsis induction and still resulted in an improved survival rate ranging between 70 and 85% [24, 25, 29]. It is to be noted that these studies either utilised LPS preconditioning, or the MSCs were injected via retro-orbital route or sequential MSCs dosing was performed post 2 h, 24 h and 48 h of septic model generation. This clearly indicates that we can extend the window timings of MSCs infusion and improve their efficacy through different alterations in the treatment regimen.

An even more compelling observation in the current analysis was that the survival rate post MSCs infusion in rats and pigs was higher (66–100%), irrespective of dose or weight. However, data analysis showed that in 98% of these studies the infusion time was less than 3 h, and mostly within 1 h of generation of the model. Timing of infusion seemingly played an important role here: if the infusion takes place as soon as the sepsis condition is generated, then the dose does not matter. As the time window increases, the outcome of the infusion becomes dependent upon the dose of MSCs and the weight of the animals, especially in the case of mice. In addition to the dose/weight and time of MSCs infusion in the study, as mentioned above, internal host factors like the basal levels of proinflammatory cytokines play a major role in the response to MSCs infusion. In the sepsis model, the analysed cytokines are IL-6, TNF-α and IL-1β, which indicate a proinflammatory environment, and IL-10 and TGF-β, which are indicative of an anti-inflammatory environment. The cytokines that influenced the survival rate in our study were TNF-α and IL-6.

Although TNF-α is a proinflammatory cytokine and is usually regarded as a marker of an inflammatory microenvironment, various studies have shown that preconditioning of MSCs with TNF-α enhances the immunomodulatory capacity of MSCs. TNF-α, alone or in combination with other inflammatory mediators, influences the functional reprogramming of MSCs, including enhancing their anti-inflammatory effects [30], increasing their anti-tumour potential [31] and attenuating diseases like colitis [32].

Interestingly, IL-6 has been identified as a marker positively correlated with the severity of sepsis [33]. High levels of IL-6 have been shown to be associated with mortality of septic patients and septic models in preclinical studies [34,35,36,37]. On the contrary, when human IL-6 was injected in a mouse model 24 h before the induction of sepsis, it promoted the survival of mice. Moreover, IL-6-/- mice had no effect on anti-inflammatory cytokines [38]. On similar lines, Remick et al. (2005) developed an IL-6 knockout (KO) mouse model and observed no difference in mortality between the IL-6 KO and control groups of mice post sepsis induction [33]. In another study, IL-6 administration in a neonatal mouse model of sepsis resulted in increased survival [39], whereas total inhibition of IL-6 lowered the survival rate. Altogether, these studies suggest that IL-6, being a pleiotropic cytokine, plays variable roles, and that its role with regard to survival rate is uncertain because of its ubiquity and diverse functions.

It is well known that the anti-inflammatory effects of MSCs are dependent on the host microenvironment, and in vitro preconditioning of MSCs with proinflammatory cytokines is an effective strategy to boost their immunomodulatory activity [12, 13]. On a similar note, our analysis has pointed towards a higher survival rate in an inflammatory milieu when basal TNF-α serum levels are above 110 pg/mL, which primes BM-MSCs to have better functional capabilities. Basal levels of IL-6 over a threshold value of 380 pg/mL in the case of BM-MSCs and 900 pg/mL in the case of UC-MSCs were likewise associated with an improved survival rate. This suggests that exposing MSCs to a simulated in vivo inflammatory milieu in advance improves their functional capabilities and the survival rate in the sepsis preclinical model.

The performance and accuracy of the J48 decision trees, from which these rules have been generated, were compared with 8 other machine learning algorithms. Apart from Naïve Bayes, all the other algorithms had a high percentage (> 70%) of correctly classified instances (as well as high F-measure scores: > 0.7) post tenfold cross-validation. This supports the validity of the generated rules from a machine learning perspective.

Although our study has suggested the attributes that must be considered when planning an MSCs transplantation experiment, the dependence on data from publicly available studies has inadvertently led to inadequacies while building the decision tree models. In particular, the prevalence of missing data forced us to replace the values of some attributes with imputed data. This “artificially generated” data could have contributed to the results being skewed. As the number of records was 78, removing all the instances having missing data in any of the attributes would have significantly reduced the sample size and prohibited the application of machine learning techniques. As mentioned earlier, although AST and ALT levels were also available, we did not employ these for machine learning analysis, as they had a large number of missing values. This lack of ample data also prohibited the generation of a single unified tree. For example, two separate immunological models were generated, i.e. for TNF-α and IL-6. When a decision tree was built with both parameters in a single dataset, the rules generated were biologically non-interpretable and obscure. The limited sample size used in the present study also means that the results were skewed towards the animal that was used most often in the studies: mice. The other animal models, pigs and rats, were used in fewer studies, and detailed hierarchical decision rules could not be generated for these.

Conclusion

From the data currently available, it is a herculean task to draw an accurate conclusion regarding the optimal conditions for administering MSCs. Our analysis indicates that a threshold level of inflammation is required for MSC-driven immunomodulation, which is further dependent upon the source, dose and timing of MSCs infusion. It is difficult to establish a standard protocol, as the studies involve multiple models and varied experimental approaches that make interpretation difficult.

Five major points can be highlighted that guide the therapeutic methodologies involving MSCs in accordance to our analysis:

First, survival rate in a diseased model is dependent upon source of tissue from where MSCs are procured. This study suggests that AD-MSCs have better therapeutic efficacy for sepsis.

Second, an inflammatory milieu primes MSCs to perform immunoregulation. High basal levels of the cytokines IL-6 and TNF-α in septic animals contribute to high survival rates, particularly in mice.

Third, an important factor is the dose of MSCs. A high dose can be detrimental, as seen in the case of mice. When fewer than 0.8 million MSCs were administered per mouse, the survival rate improved remarkably.

Fourth, the type of animal model apparently influences the survival rate. The higher the animal model is in the hierarchy, the better the survival rate in septic preclinical models.

Lastly, the time of infusion of MSCs markedly alters the survival rate. The machine learning analysis indicates that the shorter the time between septic induction and MSCs infusion, the higher the survival rate.

The decision tree-based machine learning method has provided us with options to utilize MSCs in a more efficient manner for modifying real-time treatment options for many diseased states especially sepsis.