Background

Globally, the proportion of the aging population has been dramatically increasing [1]. Osteoarthritis (OA) of the knee is the most common joint disorder among people aged over 60. Among this population, 10% of males and 13% of females have symptomatic knee OA [2]. Total knee arthroplasty (TKA) is the most common surgical option for end-stage knee OA with knee deformity and persistent pain [3]. Over 20 years (1993–2012), a total of 7.8 million primary TKA procedures were performed in the United States [4]. The number of TKA procedures continues to rise and is expected to increase by 69% between 2012 and 2050 [4]. Currently, TKA using intramedullary and extramedullary alignment systems with cutting guides is the standard of care [5]. During the procedure, numerous surgical devices are utilized to ensure success. However, this creates a complicated workflow and prolongs surgery time [6]. Furthermore, it requires positioning an intramedullary nail in the femoral canal, which increases invasiveness.

An innovative surgical technique using patient-specific instrumentation (PSI) for performing TKA has been developed to reduce the technical difficulties and invasiveness associated with standard TKA. PSI is also called “patient-matched instrumentation,” “custom-fit instrumentation,” or “custom-made instrumentation” [7,8,9]. For TKA using PSI, disposable cutting blocks are generated to fit each patient’s 3-dimensional knee anatomy, based on preoperative computed tomography (CT) or magnetic resonance imaging (MRI) images combined with radiographs of the lower extremity [10]. The cutting blocks are made individually for each patient and enable the surgeon to develop a patient-specific surgical plan. The cutting blocks fit on the distal femur and proximal tibia and guide the surgeon in cutting them accurately [10].

The TKA procedure using PSI decreases rates of lower limb malalignment [11, 12] and it is expected to improve functional outcomes and decrease revision rates [13,14,15]. In the procedure, PSI does not require positioning intramedullary nails in the femur and is expected to reduce blood loss and transfusion rates. Also, the simpler workflow owing to the cutting blocks potentially reduces surgical time. Long surgical time is one of the important risk factors for postoperative surgical site infection (SSI) [16] and deep venous thrombosis (DVT) [17] related to TKA. Thus, PSI is expected to reduce postoperative SSI and DVT.

Previous systematic reviews (SRs) reported that TKA using PSI reduces surgical time, blood loss, and rates of lower limb malalignment as compared to standard TKA [12, 18,19,20,21,22,23,24,25,26]. However, these primary outcomes are surrogate markers, and the SRs did not examine whether PSI would contribute to decreasing rates of transfusion, SSI, DVT, and revision TKA as compared to standard TKA. Also, the previous SRs reported inconsistent results for patient-reported outcome measures (PROMs) based on a limited number of randomized controlled trials (RCTs). The aim of this review is to assess the effects of TKA using PSI for patients with osteoarthritis of the knee as compared to standard TKA. The complete PICO-format research question is shown in Additional file 1: Appendix A.

Methods

Criteria for including studies

We included studies that compared TKA using PSI (TKA-PSI) with standard TKA. Standard TKA was defined as TKA using an intramedullary alignment guiding nail for cutting the femur and either an intramedullary or an extramedullary alignment guiding nail for cutting the tibia. Studies were included if patients had primary TKA for knee OA classified as grade 3 or 4 according to the Kellgren-Lawrence grading system [27]. If patients had rheumatic diseases or if fewer than 80% of the study’s population had OA, the studies were excluded. Since rheumatic diseases are systemic diseases, controlling these conditions with medications directly influences clinical outcomes. The primary outcome measures were as follows:

  • PROMs such as Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), Oxford knee score (Oxford), Knee Society Score (KSS), and the Knee injury and Osteoarthritis Outcome Score (KOOS), European quality of life 5-dimensions using the visual analogue scale (EQ-5D VAS), 12-item short form health survey physical score (SF-12 physical score), 12-item short form health survey mental score (SF-12 mental score)

  • Transfusion rate

  • Blood loss

  • Surgery time

  • Complications (i.e. SSI, DVT, and revision TKA)

As a secondary outcome measure, we investigated the percentage of alignment outliers from the planned alignment in the included studies in which PROMs were examined.

For patient-reported outcome measures (PROM), surgery time, blood loss, and transfusion rate, we included randomized controlled trials (RCT). For complications (i.e. SSI, DVT, revision TKA), we also included non-randomized comparative studies (non-RCT).

We conducted a comprehensive literature search for all relevant articles using four electronic databases: the Cochrane Central Register of Controlled Trials (CENTRAL) (The Cochrane Library 2019, Issue 2), MEDLINE (1946 to February 15th, 2019), EMBASE (1974 to February 15th, 2019), and, for ongoing trials, the database of clinical trials (https://clinicaltrials.gov). The search strategies are shown in Additional file 1: Appendix B. In hand-searching, we screened the reference lists of previous SRs and relevant studies for additional articles potentially not identified through the electronic search. We chose the key search terms “knee*,” “arthroplasty OR replacement,” and “patient-specific OR patient-matched OR custom-fit OR custom-made OR custom*.” The search was limited to articles published since 2001, because PSI was initially used in 2001 in institutional studies only, and the first report using PSI was published in 2004 [28].

We adopted a 3-step screening process (title screening, abstract screening, and full-text screening) to select eligible articles. After duplicate articles were removed, two reviewers (KK and AS) independently performed title and abstract screening. If either reviewer included an article during title or abstract screening, it moved to the next screening stage. During full-text screening, discrepancies were resolved through discussion and consensus with the senior authors (FY and OA). We did not register the protocol before data collection was performed. For each study, two reviewers independently extracted data into a spreadsheet for the outcomes specified a priori. Differences were resolved by discussion.

Two reviewers independently assessed risk of bias for RCTs using the Cochrane risk of bias tool [29]. The following domains were assessed: sequence generation; allocation concealment; blinding of participants and personnel (performance bias); blinding of outcome assessment (separately for PROM, transfusion, blood loss, surgery time, and complications); incomplete outcome data (attrition bias); and selective reporting (reporting bias). For non-RCTs included for complications, we used the methodological index for non-randomized studies (MINORS) appraisal tool [30].

Dichotomous outcomes (transfusion and complications) were expressed as a risk ratio (RR) with 95% confidence intervals (CI). Continuous outcomes (i.e. PROM, blood loss, and surgery time) were expressed as a mean difference (MD) between TKA-PSI and standard TKA groups. We calculated a standardized mean difference (SMD) as the effect size measure when studies used different instruments to assess the same outcome (e.g. blood loss).
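The three effect measures named above can be sketched in a few lines. The following is a minimal illustration, not the review's actual computation (which was run in Review Manager): the function names are hypothetical, the RR interval uses the usual log-scale standard error, and the SMD is assumed to be Hedges' adjusted g with a commonly used approximation for its standard error.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def risk_ratio(events_a, n_a, events_b, n_b):
    """Risk ratio of group A versus group B with a log-scale 95% CI."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    low = math.exp(math.log(rr) - Z95 * se_log)
    high = math.exp(math.log(rr) + Z95 * se_log)
    return rr, low, high


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Mean difference (A minus B) with a 95% CI."""
    md = mean_a - mean_b
    se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return md, md - Z95 * se, md + Z95 * se


def standardized_mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Hedges' adjusted g with an approximate 95% CI (assumed SMD variant)."""
    df = n_a + n_b - 2
    sd_pooled = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    correction = 1 - 3 / (4 * df - 1)  # small-sample bias correction
    g = correction * (mean_a - mean_b) / sd_pooled
    n_total = n_a + n_b
    se = math.sqrt(n_total / (n_a * n_b) + g**2 / (2 * (n_total - 3.94)))
    return g, g - Z95 * se, g + Z95 * se


if __name__ == "__main__":
    # Hypothetical trial data, for illustration only.
    print(risk_ratio(10, 100, 20, 100))
    print(mean_difference(10.0, 2.0, 50, 12.0, 2.0, 50))
    print(standardized_mean_difference(10.0, 2.0, 50, 12.0, 2.0, 50))
```

Because the SMD divides by the pooled SD, it is unit-free, which is what allows drain volume, hemoglobin drop, and calculated blood loss to be combined on one scale.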

We analyzed outcomes according to the modified intention-to-treat method without imputation. When data were not expressed as mean and standard deviation (SD), but as median with minimum-maximum range or interquartile range, we estimated the mean and SD following Wan et al. (2014) [31].
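A minimal sketch of this kind of conversion is given below. It assumes the range-based and interquartile-range-based estimators generally attributed to Wan et al.; which scenario was applied to each study is not stated in the text, and the function names and example values are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its inverse CDF


def mean_sd_from_range(minimum, median, maximum, n):
    """Estimate mean and SD from median and min-max range for sample size n."""
    mean = (minimum + 2 * median + maximum) / 4
    z = _STD_NORMAL.inv_cdf((n - 0.375) / (n + 0.25))
    sd = (maximum - minimum) / (2 * z)
    return mean, sd


def mean_sd_from_iqr(q1, median, q3, n):
    """Estimate mean and SD from median and interquartile range for sample size n."""
    mean = (q1 + median + q3) / 3
    z = _STD_NORMAL.inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


if __name__ == "__main__":
    # Hypothetical summary statistics, for illustration only.
    print(mean_sd_from_range(0.0, 5.0, 10.0, 50))
    print(mean_sd_from_iqr(4.0, 5.0, 6.0, 100))
```

Both estimators assume approximately normally distributed data, so they are less reliable for strongly skewed outcomes.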

Heterogeneity between pooled studies was assessed using the chi-square test, with statistical significance set at p < 0.10, and the I² statistic [32]. The I² value was interpreted as follows: 0–40% might not be important, 30–60% moderate, 50–90% substantial, and 75–100% considerable. We also considered the variance of the point estimates and the overlap of the confidence intervals.
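As a rough illustration of how the heterogeneity statistic is obtained, the sketch below computes Cochran's Q and I² from per-study effect estimates and standard errors. The function name and input values are hypothetical, and the chi-square p-value is omitted because it needs a chi-square distribution function that the Python standard library does not provide.

```python
def heterogeneity(effects, std_errors):
    """Return Cochran's Q, its degrees of freedom, and I-squared (in percent)."""
    weights = [1 / se**2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is the share of total variation beyond chance, floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared


if __name__ == "__main__":
    # Hypothetical study estimates, for illustration only.
    q, df, i2 = heterogeneity([0.0, 1.0], [0.1, 0.1])
    print(f"Q = {q:.1f}, df = {df}, I2 = {i2:.0f}%")
```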

To assess reporting biases, we constructed funnel plots for each outcome for which there were at least five trials. We pooled the included studies using the generic inverse variance method for continuous outcomes and the Mantel-Haenszel method for dichotomous outcomes in Review Manager 5 (RevMan). Either a fixed-effect or a random-effects model was used, depending on the heterogeneity. We created a ‘Summary of findings’ for the main comparison. For PROM, we selected the outcome measure with the largest number of pooled patients, separately for follow-up of less than 1 year and for follow-up of 1 year or more. We pre-specified and carried out sensitivity analyses by excluding non-RCTs for complications. We assessed the quality of evidence for each outcome measure using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach [33].

Results

Results of the search

We searched from February 15th, 2019 to March 15th, 2019. We identified a total of 1386 articles (CENTRAL 128, EMBASE 651, MEDLINE 607). After the removal of duplicates, we checked 913 articles in title screening and 320 articles in abstract screening. We added 13 articles from the reference lists of the pooled articles and then carried out full-text screening. From the database of clinical trials, we included 1 study. Overall, 38 studies were included in this systematic review: 38 in the qualitative synthesis and 37 in the quantitative synthesis (meta-analysis). Details of the screening process are illustrated in the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flowchart in Fig. 1.

Fig. 1 PRISMA flowchart

Included studies

We included 38 studies, with a total of 3487 patients (1753 in TKA-PSI, and 1734 in standard TKA) [7, 34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71]. All studies except NCT02539992 were published in 2012–2018. We included 15 studies in PROM (14 studies in meta-analysis), 18 studies in surgery time (18 studies in meta-analysis), 15 studies in blood loss (15 studies in meta-analysis), 9 studies in transfusion rate (8 studies in meta-analysis), and 24 studies in complications (23 studies in meta-analysis). Boonen et al. (2013) and Boonen et al. (2016) were the same cohort and we counted them as the same study. Details of the included studies are provided in the characteristics of included studies table (Table 1).

Table 1 Characteristics of included studies table

Among the included studies, 29 studies with a pooled 2547 patients explicitly included patients with osteoarthritis, whereas 7 studies with a pooled 726 patients recruited patients with knee deformity. Twenty-seven studies used MRI-based PSI, whereas 10 studies used CT-based PSI. Many varieties of PSI device were used; the most common was Visionnaire (Smith & Nephew, Memphis, USA) in 15 studies, followed by Materialise (Zimmer via Materialise, Belgium) in 10 studies. Thirteen studies, with a pooled 1384 patients, followed patients for 1 year or more.

Excluded studies

In the full-text screening process, we excluded 113 studies. Studies on which the reviewers initially disagreed, or which were excluded for special reasons, are listed in Additional file 1: Appendix C. We found 18 ongoing trials in the clinicaltrials.gov database, and we searched for their results using the corresponding author’s name and the institution where each trial was conducted. The list of ongoing trials is shown in Additional file 1: Appendix D.

Risk of bias in included studies

We assessed the risk of bias in the included studies. For sequence generation, 8 out of 25 RCTs properly generated unpredictable randomization lists and were at low risk. For allocation concealment, 9 studies appropriately kept the allocation concealed from recruiters, whereas two studies were exposed to a high risk of selection bias by recruiting participants alternately into the PSI and standard TKA groups [35], or by using block randomization without concealing the block size from recruiters [65]. PSI requires preoperative MRI or CT to create patient-specific cutting blocks, and we assumed that patients were therefore aware of whether they had been allocated to the TKA-PSI group. For that reason, in all RCTs except two, there was a high risk of bias in the domain of blinding of outcome assessment for PROM. In those two studies, all patients had preoperative MRI or CT and blinding of patients was robustly maintained, with a low risk of bias [42, 70]. Also, owing to the nature of surgical trials, surgeons and scrub nurses could not be blinded. This lack of blinding potentially caused performance bias for surgery time, blood loss, and transfusion rate, with a high risk of bias. For incomplete outcome data, surgery time, blood loss, and transfusion rate were less likely to be affected, since these were recorded perioperatively in medical charts during admission. Six studies in PROM had a high risk of attrition bias [42, 51, 53, 58, 68, 71]. Among the studies included for the percentage of outliers in the positioning of the prosthetic implants, five studies clearly blinded the outcome assessors to the images and had a low risk of detection bias [45, 53, 57, 58, 67].

Among the non-RCTs, the mean MINORS score was 16.8 (SD 3.9). Seven studies were prospective non-RCTs, whereas 6 studies were retrospective non-RCTs. In only one study was measuring complications the main purpose. The MINORS scores of the included non-RCTs are presented in Additional file 1: Appendix E.

Patient-reported outcome measures

The following outcome measures included two or more studies: KSS knee, KSS function, KSS total, Oxford, WOMAC, KOOS symptom, KOOS pain, KOOS ADL, KOOS sports, KOOS QoL, EQ-5D VAS, SF-12 physical score, and SF-12 mental score.

In patients followed for less than 1 year, 17 studies were included with a pooled 1609 patients. There were no significant differences in KSS knee, KSS function, KSS total, WOMAC, and Oxford scores between TKA-PSI and standard TKA (MD 0.24 (95%CI -2.25 – 3.65) in KSS knee at mean 3-months follow-up; MD 0.56 (95%CI -1.98 – 3.10) in KSS function at mean 3-months follow-up; MD -2.48 (95%CI -10.24 – 5.29) in KSS total at mean 3-months follow-up; MD 0.05 (95%CI -1.69 – 1.80) in WOMAC at mean 3-months follow-up; and MD -0.48 (95%CI -1.92 – 0.97) in Oxford score at mean 3-months follow-up). The MDs were smaller than the minimal clinically important differences (MCID) in KSS knee (MCID 5.3), KSS function (MCID 6.1), WOMAC (MCID 10), and Oxford scores (MCID 4.3) [72,73,74]. There were no significant differences in EQ-5D VAS, SF-12 physical and mental scores between groups: MD -0.56 (95%CI -6.57 – 5.45) in EQ-5D VAS at mean 3-months follow-up; MD 0.51 (95%CI -2.58 – 3.59) in SF-12 physical score at mean 3-months follow-up (MCID 4.5); MD 1.84 (95%CI -3.02 – 6.69) in SF-12 mental score at mean 3-months follow-up (MCID 3.3) [72].

In patients followed for 1 year or more, 13 studies were included with a pooled 1384 patients. There were no significant differences in Oxford, WOMAC, KSS knee and total, KOOS ADL and sports, and EQ-5D VAS between groups: MD 1.00 (95%CI -11.54 – 0.59) in KSS knee at 12-months follow-up, MD -5.22 (95%CI -10.70 – 0.06) in KSS function at 12-months follow-up, MD -2.51 (95%CI -14.18 – 9.15) in KSS total at mean 18.8-months follow-up, MD 2.66 (95%CI -1.34 – 6.67) in Oxford at mean 20.0-months follow-up, MD 0.25 (95%CI -4.39 – 4.89) in WOMAC at mean 21.6-months follow-up, MD 6.09 (95%CI -0.03 – 12.21) in KOOS ADL at mean 18.3-months follow-up, MD -2.39 (95%CI -19.99 – 15.20) in KOOS sports at mean 18.3-months follow-up, and MD -1.38 (95%CI -6.87 – 4.10) in EQ-5D VAS at mean 24-months follow-up. We found differences in KSS function and in KOOS symptom, pain, and QoL between groups; however, the number of pooled patients was small (fewer than 150) and the differences were smaller than the MCID: MD -5.34 (95%CI -10.50 – -0.18) in KSS function at mean 12-months follow-up (MCID 6.1), MD 5.23 (95%CI 0.11–10.35) in KOOS symptom at mean 18.3-months follow-up (MCID 10.7), MD 9.67 (95%CI 3.88–15.46) in KOOS pain at mean 18.3-months follow-up (MCID 16.7), MD 9.77 (95%CI 2.56–16.97) in KOOS QoL at mean 18.3-months follow-up (MCID 15.6) [73, 75]. We present more details in the forest plots in Figs. 2, 3, 4, 5, 6 and 7.

Fig. 2 Forest plots in KSS

Fig. 3 Forest plots in Oxford

Fig. 4 Forest plots in WOMAC

Fig. 5 Forest plots in KOOS

Fig. 6 Forest plots in EQ-5D VAS

Fig. 7 Forest plots in SF-12

Two studies adopted unique outcome measures (i.e. Kujala [65] and Lysholm [35] scores), but they did not show differences in the outcome measures between TKA-PSI and standard TKA groups.

Surgery time

Eighteen studies were included with a pooled 1592 patients. There was high heterogeneity between studies, with I² = 94%. The pooled mean difference was -3.09 min (95%CI -6.73 – 0.55), with no significant difference between groups, as shown in Fig. 8.

Fig. 8 Forest plot in surgery time

Blood loss

Fifteen studies were included, with various measurement scales and different postoperative time points at which blood loss was measured. To measure blood loss, 5 studies measured the volume of drainage fluid from a suction drain device [34, 36, 39, 51, 67], 5 studies measured perioperative hemoglobin reduction [35, 42, 54, 64, 70], 3 studies measured intraoperative bleeding [7, 55, 68], 1 study calculated blood loss using the Mercuriali & Inghilleri formula [53, 76], and 1 study lacked information about the measurement [44]. The most common time points were intraoperative in 3 studies, 24 h postoperatively in 3 studies, 48 h in 2 studies, the time of lowest hemoglobin in 1 study, and not shown in 4 studies. TKA-PSI decreased blood loss as compared to standard TKA with SMD -0.36 (95%CI -0.57 – -0.15, p = 0.001), as seen in Fig. 9, although the effect size was small, equivalent to a hemoglobin reduction of 0.4 g/dl (95%CI 0.18–0.68). The effect size was estimated using the calculated SMD (-0.36, 95%CI -0.57 – -0.15) and the pooled SD (1.2 g/dl) of hemoglobin decrease from a previous review that examined blood loss after TKA [77, 78]: 0.43 g/dl (95%CI 0.18–0.68) = 0.36 (95%CI 0.15–0.57) × 1.2 g/dl.
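The back-conversion from an SMD to hemoglobin units is a single multiplication by a reference SD, applied to the point estimate and to each confidence limit. The short sketch below reproduces that arithmetic with the SMD magnitudes (0.36, 0.15, 0.57) and the reference SD (1.2 g/dl) quoted in the text; the function name is hypothetical.

```python
def smd_to_original_units(smd, ci_low, ci_high, reference_sd):
    """Re-express an SMD and its CI limits in the units of reference_sd."""
    return smd * reference_sd, ci_low * reference_sd, ci_high * reference_sd


if __name__ == "__main__":
    estimate, low, high = smd_to_original_units(0.36, 0.15, 0.57, reference_sd=1.2)
    print(f"{estimate:.2f} g/dl (95%CI {low:.2f} to {high:.2f})")
```

Multiplying through gives approximately 0.43 g/dl, with limits of about 0.18 and 0.68 g/dl. The result depends entirely on how well the borrowed reference SD represents the pooled trials.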

Fig. 9 Forest plot in blood loss (SMD)

Transfusion rate

Nine studies were included in the qualitative synthesis and 8 studies were meta-analyzed, with a pooled 487 patients. The overall transfusion rate was 17.3% (98/567), and two studies had no cases of transfusion. There was no significant difference in transfusion rate between the TKA-PSI and standard TKA groups, with risk difference -0.14 (95%CI -0.33 – 0.05, p = 0.16), as seen in Fig. 10. Pietsch et al. (2013) did not specify the transfusion rate, but they stated that there was no difference in transfusion rate in the trial.

Fig. 10 Forest plot in transfusion rate

Complications

Overall, 24 studies were included with 11 RCTs and 13 non-RCTs. For SSI, 18 studies with 8 RCTs and 10 non-RCTs were included in meta-analysis. The incidence of SSI in the follow-up periods was 1.2% (24/2067). We measured a composite outcome consisting of SSI, DVT, and revision TKA. We did not find any difference in the composite outcome between TKA-PSI and standard TKA groups: risk difference 0.00 (95%CI -0.01 – 0.01, p = 0.73), as shown in Fig. 11. In sensitivity analysis excluding non-RCTs, the result was consistent: risk difference 0.01 (95%CI -0.02 – 0.03, p = 0.46).

Fig. 11 Forest plots in complication rate (composite outcome)

There was no significant difference in SSI between TKA-PSI and standard TKA groups: 13 out of 1049 patients in TKA-PSI versus 11 out of 1018 patients in standard TKA, with risk difference 0.00 (95%CI -0.01 – 0.01, p = 0.83), as seen in Additional file 1: Appendix F. In sensitivity analysis excluding non-RCTs, the result was consistent: risk difference 0.01 (95%CI -0.02 – 0.03, p = 0.59) in RCTs.
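For orientation, a crude risk difference can be computed directly from the overall SSI counts quoted above (13/1049 versus 11/1018). This is only a sketch with a hypothetical function name: it treats all patients as one large two-arm sample and uses a simple Wald interval, so it is not the study-stratified pooled analysis reported in the review.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def crude_risk_difference(events_a, n_a, events_b, n_b):
    """Unstratified risk difference (A minus B) with a Wald 95% CI."""
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    rd = risk_a - risk_b
    se = math.sqrt(risk_a * (1 - risk_a) / n_a + risk_b * (1 - risk_b) / n_b)
    return rd, rd - Z95 * se, rd + Z95 * se


if __name__ == "__main__":
    # SSI counts as quoted in the text: TKA-PSI versus standard TKA.
    rd, low, high = crude_risk_difference(13, 1049, 11, 1018)
    print(f"RD = {rd:.4f} (95%CI {low:.4f} to {high:.4f})")
```

The crude interval spans zero, in line with the absence of a significant difference, but a stratified pooled estimate weights each study separately and can differ from this crude figure.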

For DVT, 16 studies were included in meta-analysis with 7 RCTs and 9 non-RCTs. The incidence of DVT was 0.7% (12/1716) in the pooled trials. There was no significant difference between groups: 9 out of 881 patients in TKA-PSI versus 3 out of 835 in standard TKA, with risk difference 0.01 (95%CI -0.01 – 0.02, p = 0.28), as seen in Additional file 1: Appendix G. In sensitivity analysis excluding non-RCTs, the result was consistent: risk difference 0.01 (95%CI -0.01 – 0.03, p = 0.44) in RCTs. Pulmonary emboli in Boonen et al. (2016) and NCT02539992 were counted as DVT, considering the overlap between pulmonary embolism and DVT.

For revision TKA, 14 studies were included and 13 studies were meta-analyzed with 4 RCTs and 9 non-RCTs. The incidence of revision TKA was 0.9% (11/1227) and there was no significant difference between groups: 3 out of 601 patients in TKA-PSI versus 8 out of 626 in standard TKA, with risk difference -0.01 (95%CI -0.02 – 0.01, p = 0.83), as seen in Additional file 1: Appendix H. The specific reasons for revision TKA were infection (n = 1), tibia loosening (n = 1), and patella resurfacing (n = 1) in the TKA-PSI group, and infection (n = 2), tibia loosening (n = 2), patella resurfacing (n = 2), and instability (n = 2) in the standard TKA group. Pourgiezis et al. (2016) reported one case of revision TKA in the TKA-PSI group, but the data were not synthesized in meta-analysis because the number of events in standard TKA was not shown. Also, Rathod et al. (2015) reported a reoperation for hematoma in the TKA-PSI group, but we did not count the case as revision TKA. Findings of the primary outcomes are summarized in the ‘Summary of findings’ (Fig. 12).

Fig. 12 Summary of findings

Lower-limb alignment (secondary outcome)

Eight studies were included with a pooled 930 patients [45, 49, 53, 57, 58, 67, 68, 70]. Lower-limb alignment was monitored using the hip-knee-ankle angle (HKA) [53, 58, 67, 70] and mechanical axis [45, 49, 57, 68]. The positioning of the femoral prosthetic implant was examined using coronal [45, 49, 53, 57, 58, 67, 70] and sagittal [49, 53, 57, 58, 70] radiographs or CT scans. Also, rotational alignment of the femoral prosthetic implant was examined using CT scans [45, 58, 70]. The positioning of the tibial prosthetic implant was examined using coronal [45, 49, 53, 57, 58, 67, 70] and sagittal [45, 49, 53, 57, 58, 70] radiographs or CT scans. In the included studies, rotational alignment of the tibial prosthetic implant was not examined. All studies defined outliers as a deviation of three or more degrees from the planned alignment.

There were no significant differences in HKA (95%CI -0.09 – 0.05, p = 0.58), mechanical axis (95%CI -0.21 – 0.16, p = 0.81), femoral coronal positioning (95%CI -0.05 – 0.05, p = 0.95), femoral sagittal positioning (95%CI -0.13 – 0.14, p = 0.91), femoral rotational positioning (95%CI -0.32 – 0.07, p = 0.21), tibial coronal positioning (95%CI -0.03 – 0.05, p = 0.60), and tibial sagittal positioning (95%CI -0.16 – 0.10, p = 0.63) when comparing between TKA-PSI and standard TKA, as shown in Additional file 1: Appendix I.

Discussion

This systematic review included 38 studies (26 RCTs and 12 non-RCTs) to evaluate the efficacy of TKA using PSI as compared to standard TKA for patients with end-stage knee OA. For PROM, TKA-PSI did not show superior outcomes among patients followed for less than 1 year. Also, among patients followed for 1 year or more, we could not find clinically important differences between TKA-PSI and standard TKA groups. Lower-limb alignment and prosthetic implant positioning did not differ between TKA-PSI and standard TKA groups. TKA-PSI decreased perioperative blood loss, though the effect size was small. TKA-PSI did not reduce transfusion rate or surgery time. The most striking finding of this review was that, for the three prominent complications we investigated (i.e. SSI, DVT, and revision TKA), overall complication rates in TKA-PSI were low: 1.3% in SSI, 1.0% in DVT, and 0.5% in revision TKA in the short-term follow-up periods (maximum 44 months). We did not find any differences in complication rates between TKA-PSI and standard TKA groups, but the pooled events were insufficient to draw a conclusion.

We found 12 other systematic reviews assessing the efficacy of PSI as compared to standard TKA. Radiographic, CT-, or MRI-identified alignment of the components (i.e. mechanical axis) was the most common main outcome, used in 9 reviews. Two reviews showed the efficacy of PSI for favorable mechanical axis and femoral rotational alignment. However, the effects of the favorable alignment on PROM and on reduction of revision TKA were not reported in those reviews. Goyal et al. (2016) reported that a meta-analysis of 5 RCTs showed no significant differences in PROMs between TKA-PSI and standard TKA, but the result was not conclusive given the limited number of pooled patients (379 patients). Mannan et al. (2017), who included non-RCTs, demonstrated that TKA-PSI did not improve PROM compared to standard TKA. In this review, we limited the analysis to RCTs and meta-analyzed a pooled 1299 patients (666 in TKA-PSI versus 633 in standard TKA) in PROM, concluding that PSI did not improve PROM among patients followed either for less than 1 year or for 1 year or more. This conclusion is consistent with the previous systematic reviews. Thienpont et al. (2017) reported that TKA-PSI decreased blood loss and surgery time; however, the potential benefits, such as decreasing transfusion, SSI and DVT, were not reported. In our review, TKA-PSI decreased blood loss with a small effect, which corresponded to a hemoglobin reduction of 0.4 g/dl (95%CI 0.1–0.7), but did not shorten surgery time. TKA-PSI did not reduce transfusion, SSI, or DVT rates.

The most noteworthy strength of this review is its robust methodology: this review followed guidance from the Cochrane Handbook for Systematic Reviews [79]. In our search for potentially eligible studies, we screened ongoing clinical trials. Also, we selected the main outcomes avoiding surrogate or interim outcomes [79]. Another strength is generalizability to the target population of interest. Among the included studies, 29 studies recruited patients solely diagnosed with knee OA. The proportion of females in the pooled population was 62%, which closely reflected the general population [80]. Also, their ages were mostly within 60–79 years, which is the predominant population in need of TKA [81]. Overall, the characteristics of the pooled population in this review reflected those of patients with knee OA in clinical practice. Also, 1361 patients were pooled in PROM, and the generalizability of this review is favorable.

The most notable issue in the certainty of evidence was inadequate blinding. Patients could not be blinded because PSI requires preoperative MRI or CT. Also, surgeons could not be blinded, which potentially created performance bias. For these reasons, we interpreted the results in PROM as having a moderate quality of evidence. For surgery time and blood loss, the point estimates of the included studies varied with high heterogeneity (I² = 94%), and we interpreted the evidence as being moderate in quality. The results in transfusion and complication rates were imprecise because of the small number of pooled events, with wide confidence intervals. We did not downgrade, but we suspected publication bias in the funnel plot for transfusion rate in Additional file 1: Appendix J. We interpreted them as having a moderate quality of evidence. The assessment sheet of quality of evidence for each outcome is presented in Additional file 1: Appendix K.

As a limitation, we identified 4 studies in the non-English literature during the search process, but we could not include their data in this review because we were unable to translate the articles (language bias), as listed in Additional file 1: Appendix C. Also, in two studies, SDs were not presented and we estimated the SD by referring to the other studies in this review: Roh et al. (2013) and Boonen et al. (2013) for surgery time and blood loss. In this review, the longest follow-up period was 44 months, and future systematic reviews including studies with longer follow-up periods are needed to examine the long-term efficacy and the cost effectiveness of PSI. Also, a newer surgical device and procedure, robotic-arm assisted TKA, has recently been introduced, with potentially superior clinical outcomes as compared to standard jig-based TKA [82]. Further studies are required to elucidate its full benefit and any limitations.

Conclusions

TKA using PSI does not improve PROMs, surgery time, or transfusion rate as compared to standard TKA among patients with end-stage OA of the knee followed for less than 1 year and for 1 year or more. TKA using PSI decreases blood loss with a small effect, but the effect is not enough to decrease transfusion rate. TKA using PSI does not reduce surgery time, and if it does, the degree of reduction is not clinically significant. TKA using PSI may not reduce SSI, DVT, and revision TKA, but these findings are inconclusive.