Introduction

Nonunion fractures and large bone deficits are not rare in everyday clinical practice. Most often, they result from fractures, trauma, infection, tumor treatment, some rare syndromes, or even aging. The gold standard for treating these deficits is the autologous bone graft, but it comes with drawbacks such as donor-site morbidity, postoperative pain and a limited amount of available bone [1].

In recent decades, researchers have proposed the use of stem cells, with great potential, for the treatment of a variety of diseases. The most widely used in clinical practice are mesenchymal stem cells (MSCs). They are multipotent stem cells that can be isolated from almost every human tissue and can differentiate into many cell types, such as chondroblasts, osteoblasts and adipocytes. Their isolation is rather easy, and they can be cultivated in great numbers with genomic stability and limited ethical issues [2,3,4].

In clinical practice, the term “mesenchymal stem cell” is often used loosely. The International Society for Cellular Therapy (ISCT) has established criteria to distinguish these cells from other cell types. The minimum criteria are adherence to plastic during culture, the ability to differentiate into osteoblasts, chondroblasts and adipocytes, and the expression of certain cell markers (positive for CD105, CD73 and CD90, and negative for CD45, CD34, CD14 or CD11b, CD79α or CD19, and HLA-DR) [5].
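As an illustration only, the minimum criteria listed above can be written as a simple check. The function name, the argument names and the ASCII spelling `CD79a` are assumptions made for this sketch, and the quantitative thresholds a laboratory would apply to each marker are not modelled.

```python
# Hypothetical helper: encodes the ISCT minimum criteria as listed in the text.
# Marker names are plain strings; "CD79a" stands in for CD79α (assumption).
POSITIVE_MARKERS = {"CD105", "CD73", "CD90"}

# Each negative requirement is a group of alternatives, e.g. "CD14 or CD11b".
NEGATIVE_GROUPS = [
    {"CD45"},
    {"CD34"},
    {"CD14", "CD11b"},
    {"CD79a", "CD19"},
    {"HLA-DR"},
]


def meets_isct_minimal_criteria(plastic_adherent, trilineage_differentiation,
                                positive_markers, negative_markers):
    """Return True only if all three minimum criteria are satisfied."""
    positive = set(positive_markers)
    negative = set(negative_markers)
    # All three positive markers must have been shown to be expressed.
    positives_ok = POSITIVE_MARKERS <= positive
    # At least one marker from every negative group must have tested negative.
    negatives_ok = all(group & negative for group in NEGATIVE_GROUPS)
    return bool(plastic_adherent and trilineage_differentiation
                and positives_ok and negatives_ok)
```

A study reporting only part of the marker panel would fail this check, which mirrors the kind of screening decision a reviewer makes when deciding whether cells were adequately characterized.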

The therapeutic potential of mesenchymal stem cells is exploited in tissue engineering, in the effort to regenerate tissue. Tissue engineering combines engineering with the life sciences in order to create “biological substitutes that restore, maintain or improve the function of tissue” [6, 7]. It rests on three main components: stem cells, osteoinductive molecules, and osteoconductive scaffolds. Together, the three provide the necessary elements and the supportive environment needed for bone formation [8, 9]. It has also been proposed that the process should take place in a mechanically stable environment. These four parameters form the diamond concept, proposed by Giannoudis et al. [10], which is applied in the pursuit of bone regeneration.

Advances in scaffold fabrication have brought forward many candidate materials, and the variety of manufacturing techniques makes it possible to produce scaffolds with the mechanical properties and the micro- and macro-architecture of choice [11,12,13,14]. The essential criteria for a scaffold material are biocompatibility and biodegradability. Scaffolds can display not only osteoconductive but also osteoinductive properties, and 3D printing technology makes it possible to print cells inside the scaffold at the same time, in pursuit of the best result [15,16,17,18]. Several materials that meet those criteria have been used, such as bioceramics, which have the best osteoinductive properties, or natural polymers, which interact with MSCs in a more physiological manner. To enhance both the survival and the differentiation of cells, composite biomaterials have been proposed, which combine two or more different materials and increase the degree of tissue regeneration. However, most of the concepts of stem cells and scaffolds for bone regeneration have been tested only in vitro and in animal models [19,20,21,22,23,24].

In the last decade, various systematic reviews and meta-analyses have been published on the regeneration of bone deficits with the administration of stem cells or the application of scaffolds, in both humans and animals. Recently, a network meta-analysis investigated the regeneration of periodontal defects in animals after stem cell application [25]. It included 60 studies with 5 different types of MSCs. The strongest evidence for bone regeneration was observed when periodontal ligament stem cells (PDLSCs), bone marrow MSCs (BMMSCs) and dental pulp stem cells (DPSCs) were applied on scaffolds, compared with the scaffold alone. Comparisons between the different MSC types were mainly indirect, so there is less certainty about their relative effects.

Two recent systematic reviews and meta-analyses address fracture healing. Kaspiris et al. collected studies using osteoinductive substances such as growth factors, bone morphogenetic proteins (BMP-2, BMP-7) or platelet-rich plasma (PRP), as well as the application of MSCs. According to their study, the use of MSCs in long bone fractures did not appear to affect healing compared with the control group, but neither did it cause adverse reactions, including ectopic osteogenesis or malignancy. However, the main research question of that study concerned growth factors or cellular therapy in the treatment of non-unions of long bone fractures only, and only three of its studies referred to cellular therapy, with or without the addition of a scaffold [26]. Similarly, the study of Yi et al. showed encouraging results from the use of MSCs in fractures, in both animal and human studies. Although that study assessed the administration of stem cells alone, their application seems to be effective in the treatment of bone fractures [27].

The concurrent application of stem cells and scaffolds has also been assessed. In two systematic reviews of animal studies, positive effects on bone regeneration were observed when MSCs were used in combination with scaffolds. Moreover, adding growth factors gave better results than omitting them [28, 29].

However, because many different scaffolds and stem cells have been proposed, the evidence for their effectiveness is scarce. Additionally, most studies and systematic reviews on bone regeneration provide little evidence in human subjects. The aim of this systematic review was therefore to assess the effectiveness of a combination of mesenchymal stem cells and scaffolds in the treatment of bone deficits in humans. We also assessed the safety of this treatment and its effect on the function and quality of life of patients.

Methods

Registration

The protocol of the current systematic review was prepared in accordance with PRISMA-P [30] and registered in PROSPERO under registration number CRD42022359049.

Eligibility Criteria

The review question was whether the combined application of stem cells in scaffolds in bone defects is an effective method for bone regeneration in humans. The eligibility criteria were structured as follows.

P (population): people with bone deficits

I (intervention): stem cells in scaffolds

C (comparator): any other therapeutic intervention not involving a combination of stem cells with scaffolding, or absence of a control group

O (outcome): bone regeneration

S (study type): clinical studies in humans

Population/Participants

The included studies concerned patients with a bone deficit or femoral fracture, regardless of its location. There was no restriction on age or general health.

Interventions

Studies had to have at least one group in which the intervention consisted of the use of stem cells in scaffolds for the treatment of the bone defect. No restriction was applied to the type of stem cells, the type of scaffold, or a particular combination of the two. For a study to be included in the systematic review, the cells had to be characterized as stem cells before their application. For mesenchymal stem cells, the cell markers proposed by the ISCT were used for the characterization of the cells [5].

Comparators

The included studies could be with or without control groups, so that not only the efficacy but also the safety of the intervention could be assessed. The intervention in the control group could be a bone graft, stem cells or scaffolds alone, or even no intervention at all.

Outcomes

The main outcome assessed was the healing of the bone defect, measured by clinical and radiographic assessment of the recovery of the defect. If a histological analysis was presented, it was assessed as well. Because this is a rather new treatment, the systematic review also aimed to ascertain the safety of the intervention through the reported adverse events. When available, we also assessed measurements of the rehabilitation of function and of the quality of life of patients, either before and after the intervention or as the difference between the intervention and control groups, according to the type of defect.

Study Design

We included in our systematic review only clinical trials in humans, including controlled clinical trials and randomized clinical trials. We included only studies from the last 15 years, for homogeneity between the studies.

Language

There was no restriction by language.

Information Sources and Search Strategy

The studies were identified by searching the electronic databases PubMed (MEDLINE), Cochrane (CENTRAL) and Web of Science, and the registries ClinicalTrials.gov and the WHO International Clinical Trials Registry Platform (ICTRP) (http://apps.who.int/trialsearch/). After the selection of the final studies, their citation lists were also scanned. The last search was conducted on 29-9-2022 and the citation list search on 1-12-2022. The search strategy and the date of the last search are reported in Table 1.

Table 1 Search strategy

Selection Process

The retrieved articles were collected in Mendeley. Rayyan [31] was then used to facilitate the first- and second-level screening. The selection was conducted by two independent reviewers (AMT, MT), first by choosing the appropriate articles according to their title and abstract and then, for the articles that passed the first screening, by full-text screening according to the inclusion criteria described above. Discrepancies were resolved by consensus with a senior author (AK).

Data Collection Process

The data of the selected studies were collected in an Excel sheet. The collection form was created in advance, and calibration tests were conducted before starting the review, so that any problems were resolved before data collection began. The data collection was conducted by one reviewer, and a second reviewer checked the data. Discrepancies were resolved by the two reviewers through discussion. In addition to the main article, any supplementary items or protocols published in a study registry were checked. The data collected included the demographic characteristics of the patients, the distribution of the patients in the groups, the outcomes measured, and characteristics of the methodology used by the researchers. Any funding information was also recorded.

Data Items

The following data were extracted: name of the first author, year of publication, patient characteristics, study design, number of patients, intervention and control therapy, type of scaffold and type of stem cells, cultivation of stem cells, type of defect, adverse events, type of measurement of the healing effect (clinical, radiographic, biopsy), type of quality of life assessment, any funding source, blinding of the researchers or of the assessors of the healing effect, follow-up time, results of each study, and statistical analysis.

For the evaluation of bone regeneration, all data were collected, whether radiographic or histological. For safety, any reported adverse reactions were recorded, as well as pain evaluation. For the restoration of function and the quality of life of patients, results from questionnaires or any other evaluation by the researchers were collected. For each type of outcome measurement, all the different measurements were extracted, for each group and for each time period. For missing data, an attempt was made to find them in other sources, such as the study's entry in a trial registry. When they could not be identified, the fields were left blank or filled with as much of the information as could be found.

Outcomes

Primary outcomes

  • Healing assessment (clinical, radiographic or histological measures)

  • Safety-Adverse events

Secondary outcomes

  • Function-Rehabilitation

  • Quality of life

Risk of Bias in Individual Studies

The quality of the selected studies was assessed with the Cochrane risk-of-bias tool for randomized trials (RoB 2) [32]. The assessment domains were the randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome and selection of the reported result. For each domain, each article was rated as “low”, “some concerns” or “high” risk of bias, and then the overall risk-of-bias judgement was reached. A study was rated “low” when it was judged to be at low risk in all domains; “some concerns” when it raised some concerns in at least one domain with no domain judged at high risk; and “high” when it was judged to be at high risk in at least one domain or to have some concerns in multiple domains.
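The overall judgement rule described above can be sketched in a few lines. This is a minimal illustration, not the reviewers' actual procedure: the domain keys and the function name are hypothetical, and the escalation from several "some concerns" ratings to "high" is modelled as an explicit reviewer decision (the `escalate_multiple_concerns` flag, an assumption of this sketch) rather than an automatic count.

```python
# Hypothetical domain keys for the five RoB 2 domains named in the text.
DOMAINS = (
    "randomization",
    "deviations",
    "missing_data",
    "measurement",
    "reported_result",
)
ALLOWED = {"low", "some concerns", "high"}


def overall_rob2(judgements, escalate_multiple_concerns=False):
    """Combine per-domain ratings into an overall risk-of-bias judgement."""
    values = [judgements[domain] for domain in DOMAINS]
    unknown = set(values) - ALLOWED
    if unknown:
        raise ValueError(f"unexpected rating(s): {sorted(unknown)}")
    # Any high-risk domain makes the whole study high risk.
    if "high" in values:
        return "high"
    concerns = values.count("some concerns")
    if concerns == 0:
        return "low"
    # Several "some concerns" domains become "high" only if the reviewers
    # decide that together they substantially lower confidence (assumption).
    if concerns > 1 and escalate_multiple_concerns:
        return "high"
    return "some concerns"
```

Keeping the escalation as a flag reflects that two domains with some concerns do not necessarily produce an overall "high" rating; that step remains a matter of reviewer judgement.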

For non-randomized clinical trials, the ROBINS-I tool (Risk Of Bias In Non-randomized Studies - of Interventions) [33] was used, which extends RoB 2 with three additional domains. The first is confounding, that is, a pre-intervention prognostic factor that can predict whether a patient receives one or the other intervention. In the current setting, the main confounders could be the age of the patients, the size of the defect or the time it remained untreated, certain diseases, or a treatment that can affect bone healing, such as bisphosphonates [34]. The second is the selection of participants into the study, and the third is the classification of interventions. The other 4 domains are similar to those of RoB 2. The overall judgement was reached in the same way as for RoB 2.

For single-arm studies, the NIH tool was used (Quality Assessment Tool for Before-After (Pre-Post) Studies With No Control Group, National Heart, Lung, and Blood Institute) [35], a questionnaire of 12 questions on limitations and sources of bias, which rates studies as good, fair or poor.

The assessments for each study were conducted independently by two researchers (AMT, MT), and discrepancies were resolved through discussion. The robvis tool [36] was used to visualize the results.

Effect Measures

For the assessment of bone regeneration, the mean difference and the standard deviation were used, either between the two groups at follow-up or between before and after the intervention in the single-arm studies. Qualitative assessments were transformed into a standardized mean difference and standard deviation. Results with a p value smaller than 0.05 were considered statistically significant.
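For readers unfamiliar with these effect measures, the sketch below shows one common way to compute them for two groups. The text does not state which formula was used, so the pooled-standard-deviation form (Cohen's d) is an assumption, and all function and parameter names are hypothetical.

```python
import math


def mean_difference(mean_intervention, mean_control):
    """Raw difference in means, in the units of the measurement."""
    return mean_intervention - mean_control


def pooled_sd(sd_intervention, n_intervention, sd_control, n_control):
    """Pooled standard deviation of two independent groups."""
    numerator = ((n_intervention - 1) * sd_intervention ** 2
                 + (n_control - 1) * sd_control ** 2)
    return math.sqrt(numerator / (n_intervention + n_control - 2))


def standardized_mean_difference(mean_intervention, sd_intervention, n_intervention,
                                 mean_control, sd_control, n_control):
    """Mean difference divided by the pooled SD (assumed form: Cohen's d)."""
    spread = pooled_sd(sd_intervention, n_intervention, sd_control, n_control)
    return mean_difference(mean_intervention, mean_control) / spread
```

Because the standardized form is unit-free, it allows outcomes measured on different scales to be placed side by side, which is why it is often chosen when studies report bone regeneration in different ways.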

Data Synthesis

The studies were divided according to whether or not they had a control group. Single-arm studies that used historical studies as control groups were categorized with the single-arm studies, to reduce bias in the analysis.

Due to heterogeneity between the studies, no meta-analysis was conducted. The results of each study were reported in a table, expressed as the mean difference between the intervention and control groups. The table presents all the assessments of bone regeneration at every follow-up time. When a study included more than 2 groups, the extra group was classified as control or intervention according to whether it used stem cells in scaffolds. For the assessment of safety, the adverse events were collected qualitatively in a table.

The characteristics and the distribution of the interventions were depicted in charts, created in Excel.

To assess the rehabilitation and quality of life of patients, a subgroup analysis by type of defect was conducted, and the results were presented qualitatively. No heterogeneity test was conducted, due to the different study designs and the small number of studies of each type. No sensitivity analysis was conducted.

Reporting Bias Assessment

Reporting bias was assessed for each study with the risk-of-bias tools, in the domain covering risk due to missing results, and publication bias was assessed narratively. Statistical tests (e.g., Egger’s test) and graphical assessment with funnel plots were considered inappropriate due to the heterogeneity of the studies, as the assumptions made would be misleading [37].

Certainty Assessment

To evaluate the quality of evidence for all outcomes, we used the Grading of Recommendations Assessment, Development and Evaluation (GRADE) working group methodology [38]. This methodology assesses the quality of evidence across five domains: risk of bias, consistency, directness, precision and publication bias. To achieve transparency and simplicity, the GRADE system classifies the quality of evidence into one of four levels: high, moderate, low, and very low. The results were presented in a Summary of Findings table, created online in GRADEpro.

Results

Study Selection

The search retrieved 10,091 articles. After the removal of duplicates, 8206 articles remained, of which 340 were excluded because they were published before 2007. The remaining 7866 articles were screened by title and then by abstract against the inclusion criteria. Of them, 145 articles were evaluated in full text. A citation search of those 145 articles was also conducted, and 48 further articles relevant to the research question were evaluated in full text. Finally, 14 articles met the inclusion criteria. The remaining articles were excluded because they did not use scaffolds and stem cells in combination (No1), the intervention was not applied in bone defects (No2), it was not applied in humans (No3), the studies were conducted before 2007 and were therefore more than 15 years old (No4), or the studies were still in progress (No5). In addition, studies that used mesenchymal stem cells but did not report the cell markers defining the type of cells, used markers other than those specified by the ISCT, or used a combination of stem cells were excluded (No6) because of the high heterogeneity they would introduce. For the same reason, studies that did not cultivate the stem cells were also excluded, because they could not characterize the cells by their cell markers (No7). Only clinical trials were included, so case reports were excluded (No8). Lastly, the results of one study had been withdrawn. The study selection process is depicted in the flow chart (Fig. 1), and the complete list of the studies excluded at full-text screening is reported in the Supplementary Data.

Fig. 1 PRISMA flow chart
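The counts reported in the study selection can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the numbers given in the text; the function and key names are hypothetical, and it assumes that the 145 screened articles and the 48 articles found by citation search do not overlap.

```python
def screening_flow(retrieved, after_duplicates, excluded_by_date,
                   full_text_from_screening, full_text_from_citations, included):
    """Derive the intermediate counts of a PRISMA-style selection flow."""
    full_text_total = full_text_from_screening + full_text_from_citations
    return {
        # Records dropped as duplicates.
        "duplicates_removed": retrieved - after_duplicates,
        # Records left for title and abstract screening.
        "title_abstract_screened": after_duplicates - excluded_by_date,
        # Articles read in full (assumes the two sets are disjoint).
        "full_text_total": full_text_total,
        # Articles read in full but not included.
        "not_included_after_full_text": full_text_total - included,
    }


# Counts as reported in the study selection paragraph.
flow = screening_flow(
    retrieved=10_091,
    after_duplicates=8_206,
    excluded_by_date=340,
    full_text_from_screening=145,
    full_text_from_citations=48,
    included=14,
)
```

The derived value for title and abstract screening matches the 7866 articles stated in the text.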

Study Characteristics

The study characteristics are presented in Table 2. The systematic review included 14 studies. Of them, 4 were randomized controlled studies [39,40,41,42], 5 were non-randomized controlled studies [43,44,45,46,47] and 5 were single-arm studies [48,49,50,51,52], one of which used a former clinical study as a historical control group. The study selection process retrieved only studies using mesenchymal stem cells for the treatment of bone defects in humans, and none using embryonic or induced stem cells. In all studies, the application of stem cells in scaffolds led to bone regeneration, with minimal adverse events, mostly related to the surgical procedure.

Table 2 Study characteristics

Type of Defect

The included studies concerned 6 different types of bone defect. Specifically, 5 studies were about intrabony periodontal defects, 3 about alveolar bone atrophy, 2 about alveolar cleft, 2 about non-union in long bones, and the remaining studies about other bone defects, namely a defect in the femoral bone and a cystic bone defect of the maxilla. The distribution is depicted in Fig. 2A.

Fig. 2 A Type of Defect, B Origin of Mesenchymal Stem Cells. (a-BMMSCs: alveolar Bone Marrow Mesenchymal Stem Cells, ADMSCs: Adipose-derived Mesenchymal Stem Cells, BFPCs: Buccal Fat Pad Stem Cells, BMMSCs: Bone Marrow Mesenchymal Stem Cells, DPMSCs: Dental Pulp Mesenchymal Stem Cells, PDLSCs: Periodontal Ligament Stem Cells)

Mesenchymal Stem Cells

The MSCs used can be divided into three categories: MSCs from bone marrow, taken from the iliac crest or the alveolar bone; MSCs of dental origin, from the dental pulp or the periodontal ligament; and MSCs from adipose tissue, taken from the buccal fat pad or the abdomen. The distribution is depicted in Fig. 2B.

Cultivation

Different procedures were followed depending on the MSC origin. There were also differences in the culture medium used and in the addition of serum or other additives. The number of passages did not exceed 5, and all studies used 10^5 to 10^7 cells. The cultivation characteristics of the studies are presented in Table 3 and Figs. 3 and 4A.

Table 3 Cultivation characteristics of studies

Fig. 3 A Culture medium, B Serum

Fig. 4 A Passage number, B Cell markers

Cell Markers

The cell markers evaluated differed between studies, with each research group reporting a different number of them. The surface markers tested were those defined by the ISCT as the minimum required, as well as some additional ones. The cell markers tested in each study are shown in Table 4 and their distribution in Fig. 4B.

Table 4 Cell markers identified per study

Scaffolds

There was a great deal of heterogeneity in the type of scaffold used. All studies used commercially standardized formulations that are used in clinical practice as graft materials, except for the study of Akhlaghi et al. [43], which used lyophilised human amniotic membrane from healthy donors, and the study of Relondo et al. [50], which used an autograft made by cross-linking serum albumin protein with glutaraldehyde (BioMax). In addition, two studies immersed the scaffold-cell complex in an osteogenic medium for 7 and 20–30 days before implantation. The types and characteristics of the scaffolds are shown in Fig. 5.

Fig. 5 A Type of Scaffold, B Materials of Scaffolds used, C Composition of Scaffolds used. B Magenta column: Sum of each type of scaffold, Blue column: Number of each scaffold. C Purple column: Sum of each type of composition, Blue column: Number of each scaffold

No correlation between the MSC origin and the scaffold was detected. However, there was a correlation between the MSC origin and the type of defect: researchers usually preferred MSCs from a source close to the defect site.

Risk of Bias in Studies

Because the included studies had different designs, 3 different tools were used to assess the risk of bias.

  • RoB-2

Four RCTs were included in the systematic review, 2 of them showing low risk of bias and the other 2 some concerns. For the latter two, the concerns were about the randomization process, which was not reported in detail, and about selective reporting. The results are presented in Fig. 6.

Fig. 6 Risk of bias (RoB 2) per study and per domain

  • ROBINS-I

Five studies were evaluated with the ROBINS-I tool and were rated at moderate to serious risk of bias. The main issue was confounding: where the researchers did not report the main characteristics of the included patients, it was impossible to assess whether confounders had been considered, and the study was rated at serious risk; the other 2 studies showed moderate risk. In domains 3 and 4, on classification of interventions and deviations from intended interventions, all studies were at low risk of bias, owing to the surgical nature of the intervention. Regarding the selection of the reported result, one study was rated at serious risk of bias because it did not report the results of the histological assessments mentioned in the “Methods” section of its report. The results are presented in Fig. 7.

Fig. 7 Risk of bias (ROBINS-I) per study and per domain

  • NIH

With the NIH tool, 5 studies were assessed, of which 4 were rated fair and 1 good. The results are reported in Fig. 8.

Fig. 8 Risk of bias (NIH tool)

To conclude, the studies were mainly rated at low or moderate risk of bias.

Results of Synthesis

Even though the studies investigated the same research question, they differed in their design, the type and physiology of the defect, the risk of bias, and the method used to assess bone regeneration. Therefore, no meta-analysis was conducted. The results for the main outcomes are presented in Tables 5 and 6.

Table 5 Bone regeneration
Table 6 Adverse events

Main Outcomes

Bone Regeneration

The main outcome assessed was bone regeneration. Overall, 139 patients were treated with stem cells in scaffolds. In all studies, the application was successful, and the reported results were similar to or even better than those of the control groups, which received standard care such as autologous bone graft from the iliac crest or xenografts. However, no study detected a statistically significant advantage for stem cells in scaffolds. This may be due to the small sample sizes, since the studies were Phase I or II. The results are presented in Table 5.

The reported adverse events are presented in Table 6. No serious adverse events were reported in any study, except in the study of Gomez-Barrena et al. [49, 53], where they were considered unrelated to the intervention, and in the study of Sponer et al. [47], where they were due to complications of the surgical treatment itself.

Secondary Outcomes

Because studies of the same defect type had similar outcomes and characteristics, the secondary outcomes are presented by subgroup.

Intrabony Periodontal Defects

Five studies treated intrabony defects, 3 of which were RCTs [39,40,41, 46, 51]. Besides bone regeneration, they also assessed the typical measures of periodontal health: pocket depth, clinical attachment level and gingival recession. In all studies an improvement of these outcomes was detected, but without statistical significance, except in the study of Hernandez-Mondaraz et al. [54] (p < 0.001). The study of Sanchez et al. [46] also used questionnaires to evaluate oral health-related quality of life and the patients' satisfaction with the aesthetic result at the end of follow-up. An improvement was reported in all patients, slightly greater in the control group but without statistical significance.

Alveolar Bone Atrophy

Three studies concerned alveolar bone atrophy. Alveolar bone augmentation was performed in order to insert dental implants for prosthetic rehabilitation. Implant insertion was possible in all treated patients 4–6 months after surgery. Only the study of Gjerde et al. [48] assessed osseointegration with Osstell measurements, the function of the dental prosthesis and the satisfaction of patients. All patients were satisfied with the result and would recommend the procedure to others. The Osstell measurements also increased with time.

Alveolar Cleft

In one of the two studies treating patients with clefts, dental implants were placed successfully in two adult patients [42]. The other study, a single-arm study with a historical control study, assessed tooth eruption. In two of the six patients the teeth remained impacted, and orthodontic movement was needed. In the control study, no issue with tooth eruption was reported [52].

Non-Union

In the study of Ismail et al., pain was assessed with VAS questionnaires and rehabilitation with two additional scores, the LEFS (Lower Extremity Functional Scale) and the DASH (Disabilities of the Arm, Shoulder and Hand), expressed as percentages. Pain levels decreased in both groups, but sooner in the control group, within the first month. Functional improvement was greater in the intervention group in the first three months (functional score of 43% versus 27% in the control group), with statistical significance, but the difference between the groups diminished after the 7th month. In the multicenter study of Gomez-Barrena et al. [49], pain levels during loading were assessed with a VAS questionnaire. It was assumed that stabilization of the fracture had been achieved when pain was lower than 30%. Stabilization was already seen in 87.5% of the patients within the first 3 months, in 88.9% at 6 months, and in all patients after 12 months.

Other Bone Defects

In the study of Sponer et al. [47], pain and function were assessed with the Harris Hip Score; an improvement was observed after surgery in all 3 groups, with no statistically significant difference between them.

Risk of Reporting Bias in Synthesis

Most of the included studies had published their protocol in advance in registries, so it was possible to compare the pre-defined plan with the reported results. The studies were in general true to their plan, except the study of Akhlaghi et al. [43], which stated that a histological assessment would be made but published no such results in its “Results” section.

Due to the high heterogeneity between the studies, no funnel plot or statistical analysis was conducted. However, given the large number of studies screened, from 5 different databases and registries, and the fact that the included studies are recent and in Phase I or II, we assume that the risk is small.

Certainty of Evidence

Overall, the results show low certainty of evidence, except for quality of life, which shows very low certainty. Due to the study designs and the inclusion of non-randomized clinical trials, the certainty was lowered by 1 level. A further level was removed for imprecision, due to the small sample sizes. For quality of life, 2 levels were removed because the results came from non-randomized and single-arm trials. In the remaining domains no serious risk was detected, so no level was removed. The results are presented in a Summary of Findings table (Table 7).

Table 7 Summary of findings table (GRADE)
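The downgrading described in this section can be illustrated with a small sketch. It assumes that each rating starts at "high" and that the quality-of-life outcome is also downgraded once for imprecision, which is how the stated result of "very low" is reached here; the function name and dictionary keys are hypothetical.

```python
# The four GRADE certainty levels, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_certainty(downgrades, start="high"):
    """Lower the starting level by the total number of downgrades.

    `downgrades` maps a reason to the number of levels removed.
    The result never goes below "very low".
    """
    index = LEVELS.index(start) - sum(downgrades.values())
    return LEVELS[max(index, 0)]


# Main outcomes: one level for study design, one for imprecision.
main_outcomes = grade_certainty({"study design": 1, "imprecision": 1})

# Quality of life: two levels for study design, plus imprecision (assumed).
quality_of_life = grade_certainty({"study design": 2, "imprecision": 1})
```

Under these assumptions the sketch reproduces the ratings reported above: low certainty for the main outcomes and very low certainty for quality of life.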

Discussion

The aim of this systematic review was to gather the studies using stem cells in scaffolds for bone regeneration and to assess their therapeutic capacity, their safety, and their impact on the restoration of function and the quality of life of patients. Overall, in all studies, bone regeneration was successful and safe, and function was restored to a degree that depended on the bone defect. However, the reliability of the results is low due to the small patient samples, so the results should be interpreted carefully. Nevertheless, this is the first evidence of their applicability in human subjects. The application was safe, with no serious adverse events attributed to the intervention itself, the processing of stem cells was feasible within a reasonable period, and in most cases the discomfort of patients was similar to that with the other tested interventions. Moreover, the tested intervention gave results that were even better than those of the biomaterials used in everyday practice, though without statistical significance. Finally, the current systematic review highlights the heterogeneity between the different studies and promotes the standardization of the processes needed to obtain and apply those products.

The retrieved studies were Phase I and II trials, whose main goal is to assess the safety and feasibility of the intervention. Furthermore, the sample sizes were too small to detect a statistically significant difference between the control and intervention groups. These limitations are explained by the legislation regulating such products, usually called Advanced Therapy Medicinal Products [55]. First, these products are tested in animals before their application in humans, assessing not only safety but also efficacy [56]. Therefore, their capacity to treat the disease is already known to some extent before administration in humans. The clinical phases of human trials testing these products also differ. Phase I includes patients rather than healthy subjects, mostly for ethical reasons, given the particular nature of the treatment. In Phase II the efficacy of the administration is tested, and in Phase III safety and efficacy are validated with long-term results [57]. Furthermore, in most cases Phases I and II are combined, because it may not be possible to recruit the desired sample size [58]. Consequently, because these products have been tested for only a short period, the only available information about their efficacy comes from this type of study. Apart from those issues, there is great heterogeneity between the studies, including the different defect types, the origin of the MSCs, the type of scaffold used and the method of assessing bone regeneration. In the GRADE assessment, the certainty of evidence was low to very low, due to the design of the studies and the small sample sizes.

However, the current systematic review is, to our knowledge, the first to evaluate bone regeneration after the use of stem cells in scaffolds. The eligibility criteria were strict enough to include only stem cells that had been characterized as such, ensuring homogeneity of the cells used across the studies. The review was conducted in accordance with the latest guidelines and tools for systematic reviews: PRISMA 2020, PRISMA-P, RoB 2, ROBINS-I, and GRADE. The protocol of the study was published in advance in PROSPERO, and any changes to it were published.

It provides the first evidence of the efficacy of mesenchymal stem cells in scaffolds for bone regeneration in humans, and it draws attention to the issues of clinical trials and the heterogeneity of the literature.

Many studies and systematic reviews have investigated the effect on bone regeneration of MSCs delivered as minimally manipulated whole tissue fractions. These studies differ from those included in this systematic review in that the cells are not isolated from their tissue of origin by specialized techniques but are used as a whole, as a mixture of cell populations (mesenchymal stem cells and non-stem cells), usually in smaller numbers, while utilizing their niche to promote bone regeneration [59]. Several different protocols have been published in the literature [60,61,62,63]. Recent systematic reviews show that their application in combination with scaffolds offers improved efficacy compared with the use of scaffolds alone. In addition, because of their ease of isolation and reduced cost, they may be a simple alternative [59, 64]. However, neither the number nor the population of cells used is clear. It is known that MSCs are found in tissues in small numbers (from 0.01% in bone marrow to 1% in adipose tissue), which is why cell culture is required for their application [63]. It is therefore of paramount importance that studies using this type of product clarify these differences.

It was observed that researchers preferred to use MSCs from an origin near the bone defect being treated. Possible reasons, apart from familiarity with the anatomy of the area and the easier harvesting from a site near the defect, include possible healing effects. It is known that MSCs acquire some of their characteristics from their tissue of origin. For example, umbilical cord MSCs seem to show higher proliferation and differentiation potential than bone marrow or adipose MSCs [65]. Also, MSCs from different origins express different cell markers and may differ in their immunomodulatory properties [66]. It is therefore proposed that applying MSCs from an origin close to the defect site may enhance healing of the tissue being treated in comparison with other MSCs [67, 68].

Apart from the origin of the MSCs, the cultivation process also differs between the studies. First, a particular procedure is applied depending on the origin of the MSCs [69]. Certain studies report that the culture medium or the addition of serum may influence the characteristics of MSCs. For example, in a recent study, the use of α-MEM yielded MSCs with better osteoinductive characteristics than DMEM [70]. Regarding serum addition, in recent years sera of human origin seem to have prevailed over bovine serum, mostly for ethical and economic reasons, but also because some studies indicate that human serum is safer and promotes cell proliferation to a greater degree [71,72,73,74]. However, these are still indications, and stronger evidence is needed to establish this knowledge in clinical practice.

It is well established that MSCs are characterized according to the ISCT criteria [5]. Most of the studies use those cell markers as their main criteria. Nonetheless, the included studies tested other cell markers too. As reported above, the expression of certain markers can be influenced by the origin of the MSCs, but also by the phase of culture or by cell differentiation [4, 70, 73, 75]. Identifying additional markers and the differences between the states noted above could improve the characterization of MSCs and may serve as an indicator of the cells' capabilities.

In addition, the selection process turned up some studies that did not characterize their cells, leaving the type of cells they used ambiguous. These studies could not be included, as they would have introduced bias into the review. It is clear that this heterogeneity creates confusion in the literature and possibly erroneous conclusions about the effectiveness of MSCs [76].

Finally, this systematic review found that most of the scaffolds used were commercially available biomaterials that are used in clinical practice as grafts and whose safety and efficacy are known. However, the literature suggests a plethora of scaffolds made with composite materials and specialized manufacturing techniques such as three-dimensional printing, which showed increased rates of bone healing after application to bone lesions in animals and humans, although no biomaterial has emerged as clearly superior [77, 78].

According to recent bibliometric studies, the application of stem cells in scaffolds is an active area of research for the treatment of various diseases and defects [79,80,81]. It is clear that advances in scaffold fabrication techniques, especially 3D printing, the combination of several materials, and the simultaneous implantation of cells inside the scaffold [82], together with knowledge of the many different stem cells that could be used, would provide new solutions to the current issues. Moreover, several clinical trials in progress are evaluating the effect of stem cells in scaffolds for bone regeneration (Table 8). Most of them are randomized Phase III clinical trials, which will provide more certain evidence about the long-term effects of the intervention.

Table 8 Studies in progress

Conclusion

The application of mesenchymal stem cells in scaffolds for bone regeneration is a safe intervention with positive effects, similar to or even better than standard care, and is able to restore the function and quality of life of patients. However, the certainty of the evidence is low to very low, owing to the small sample sizes and the design of the studies. In the coming years, the results of the studies in progress, the conduct of larger and better-designed studies, and the standardization of stem cell culture and scaffold manufacturing processes will provide much more evidence on the matter.