Key Points for Decision Makers

Compared with the prostate-specific antigen test, the Stockholm 3 (STHLM3) test was more effective at correctly classifying individuals, but at an additional cost.

Variations in the cost of the STHLM3 test were identified as the parameter with the greatest influence on the results.

A lower cost of the STHLM3 test would improve its cost effectiveness compared with the prostate-specific antigen test. The threshold analysis showed that at a STHLM3 cost of €138.70, the STHLM3 test would become cost neutral versus the prostate-specific antigen test.

1 Introduction

Prostate cancer (PCa) is the most common cancer among men in the Nordic countries, where it has been estimated that in the next 20 years its incidence will increase by 23–53% because of population aging [1]. Until recently, the standard diagnostic pathway for PCa involved a blood-based prostate-specific antigen (PSA) test followed by a systematic transrectal ultrasound-guided biopsy (TRUS-Bx) of the prostate, which has been shown to reduce PCa mortality [2,3,4,5]. However, both PSA and TRUS-Bx have demonstrated poor test accuracy, leading to unnecessary prostate biopsies (with a risk of inducing sepsis) and high rates of overdiagnosis and overtreatment of clinically insignificant PCa [2, 4, 6,7,8].

Magnetic resonance imaging (MRI) has emerged as a suitable alternative to improve the detection of malignant prostatic lesions, showing high sensitivity for clinically significant PCa [4, 6]. Furthermore, MRI decreases the detection of low-risk PCa and spares men without MRI lesions from biopsies [4, 9]. Thus, results from multiple clinical studies have shown that MRI followed by MRI-targeted biopsies has a higher sensitivity for the detection of significant cancer, while also decreasing the detection (likely overdiagnosis) of insignificant cancer compared to systematic biopsies [4, 10,11,12].

In 2020, MRI led to a paradigm shift in the European Association of Urology guidelines for the early diagnosis of PCa, where an MRI before biopsy (“MRI-first” strategy) was recommended and any further biopsies should be approached as an MRI-targeted biopsy instead of a random TRUS-Bx [13]. However, the widespread use of prostate MRI is currently hampered by costs and access to uroradiologists. The appropriate selection of patients for MRI, and the definition of optimal protocols for MRI sequences and active surveillance follow-up programs are other key issues in applying MRI [6]. Hence, the development of better biomarkers, which may be used as a more accurate pre-selection test for MRI than PSA, could help to reduce the bottleneck and improve the quality of early PCa diagnostics.

The Karolinska Institute in Sweden developed the Stockholm 3 (STHLM3) test, which has been proposed as an alternative to PSA testing to improve the early detection of clinically significant PCa [3]. STHLM3 is a blood test that includes PSA and four other plasma proteins, 101 genetic markers (single nucleotide polymorphisms), and clinical information about the patient (age, family history, previous prostate biopsy, and use of 5-alpha reductase inhibitors) [3, 14]. In a large population study in Sweden, the STHLM3 model (as compared to PSA) was shown to reduce the number of TRUS-Bx by one third, while maintaining the same sensitivity to clinically significant PCa, defined as a Gleason score ≥ 7 [3]. A subsequent Swedish study modeled the effect of the STHLM3 test if it replaced current clinical practice (PSA and TRUS-Bx) and found that STHLM3 testing had the potential to substantially reduce the number of biopsies while maintaining the same sensitivity to diagnose clinically significant PCa [14]. Another Swedish study investigated STHLM3 in combination with MRI and found that STHLM3 performs at least as well as PSA in detecting clinically significant PCa, while also reducing both the number of MRI procedures and the patients referred for biopsy [15]. A recent Norwegian study demonstrated fewer referrals for prostate biopsy and a higher proportion of clinically significant PCa findings in biopsies performed by replacing PSA with STHLM3 in the Stavanger region for early detection of PCa in primary care [16].

Therefore, evidence suggests that STHLM3 can improve the PCa diagnostic process compared with PSA, with the value of STHLM3 stemming from a reduction in the number of false-positive and low-grade PCa cases detected. However, the assessment of the costs associated with the potential implementation of STHLM3 may be of similar importance to decision makers. Thus, economic evaluations are useful because they provide a means of comparing the costs and consequences on patient outcomes of different approaches, which is important for evidence-based policies and decision making [17].

Few previous studies have evaluated the costs and cost effectiveness of replacing PSA with STHLM3. The previously mentioned Norwegian study performed a simple cost analysis based on estimated costs from Stavanger University Hospital in combination with outcome data from 4784 men tested with STHLM3 and found that the implementation of STHLM3 was associated with a decrease in direct healthcare costs [16]. Furthermore, a recent Swedish micro-simulation study applied a lifetime societal perspective to investigate the cost effectiveness of introducing STHLM3 as a reflex test in population-based screening. The compared scenarios were: (i) no screening, (ii) screening using PSA, and (iii) screening using STHLM3 as a reflex test for PSA values ≥ 1, 1.5, and 2 ng/mL, respectively. The results showed that relative to the PSA test, the STHLM3 reflex thresholds of 1, 1.5, and 2 ng/mL had incremental cost-effectiveness ratios (ICERs) of €170,000, €60,000, and €6000 per quality-adjusted life-year (QALY), respectively [18].

Currently, no studies have investigated the cost effectiveness of introducing STHLM3 in comparison to PSA applying the time horizon of the PCa diagnostic pathway. Therefore, the objective of this study was to assess the cost effectiveness of STHLM3 compared to PSA as the primary blood test in the diagnostic work-up of PCa, when used in an opportunistic testing setting as currently applied in routine PCa diagnostics in Denmark.

2 Materials and Methods

To ensure transparency and structure, the reporting of this study followed the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) [19]. The study applied a hospital perspective. Costs were presented in 2021 Euros (€) using a currency conversion rate of €1 = 7.45 Danish kroner. The time horizon was restricted to the PCa diagnostic pathway, which starts with the initial PSA or STHLM3 test and ends with the histopathological diagnosis of the biopsy. Because of the short time horizon (< 1 year), costs and effects were not discounted.

2.1 Population

The cost-effectiveness analysis was conducted using a hypothetical cohort of men aged 50–69 years. This specific population age was chosen to be consistent with the initial study of STHLM3, which was conducted in this age group [3]. The population was defined as male individuals referred to a PCa diagnostic work-up based on opportunistic testing.

2.2 Measure of Effectiveness

The chosen measure of effectiveness was the number of correctly classified individuals. The target condition was clinically significant PCa, defined as an International Society of Urological Pathology grade group of 2 or higher, based on histopathological findings and scored as a Gleason score 3 + 4 or higher [20]. Effectiveness was calculated as the sum of true positives and true negatives in relation to the target condition.

2.3 Model Structure

For the economic evaluation, we used decision-analytic modeling, which allowed us to use evidence from various sources and to assess uncertainties related to the model structure and input parameters of the model [21]. To address the objectives of this study, we created a decision tree to model and compare the cost effectiveness of the PCa diagnostic pathway using PSA (current standard) or STHLM3 (new alternative) as the initial diagnostic test. The strategies simulated in the analytical model were constructed based on the current PCa diagnostic process recommended in Denmark [22], which is largely consistent with the European Association of Urology guidelines [13]. The final model structure was validated by the clinical prostate radiologist from the author group (BGP) and by a multidisciplinary team of clinicians from the NorDCaP consortium specialized in urology, general practice, and clinical biochemistry. The NorDCaP Consortium is a Nordic collaboration between Stavanger University Hospital in Norway, Mehiläinen in Finland, OncoAlgorithm, Karolinska Institutet, Saint Göran Hospital in Sweden, and Aarhus University Hospital in Denmark, which conducts multiple studies investigating potential improvements in the diagnostic pathway of PCa. The diagnostic strategies consisted of initial testing using PSA or STHLM3. If the initial test was positive, it was followed by a urological examination and MRI. If the MRI was positive, it was followed by an MRI-targeted biopsy. As prostate biopsy is associated with the risk of inducing infection and even life-threatening sepsis due to increased antibiotic resistance [23], we added sepsis to the model as a potential event for individuals undergoing MRI-targeted biopsy.

The probability of a positive/negative classification was derived using a Bayesian approach applying population prevalence, sensitivity, and specificity for each diagnostic test/procedure. The model was constructed using TreeAge Pro® 2019, R2 software [24]. A simplified model is shown in Fig. 1 and the full model diagram can be accessed in the Electronic Supplementary Material (ESM).

Fig. 1 Decision tree illustrating the compared diagnostic pathways of prostate cancer. ISUP International Society of Urological Pathology, MRI magnetic resonance imaging, PI-RADS Prostate Imaging Reporting and Data System, PSA prostate-specific antigen, STHLM3 Stockholm 3

2.4 Index Tests

Prostate-specific antigen testing (index test 1a) is widely used as an initial test for the early detection of PCa. Intending to investigate the gray-zone area related to lower PSA values, within which the risk of false-positive results is highest, we defined PSA concentrations of 3–10 ng/mL as positive. Thus, individuals with PSA concentrations > 10 ng/mL were not included in the model.

The STHLM3 score (index test 1b) was based on the results of the previous STHLM3 diagnostic study, involving 59,000 men [3]. A STHLM3 test was considered positive if the risk of clinically significant PCa was ≥ 10%.

Magnetic resonance imaging (index test 2) is used to identify and locate suspicious lesions for clinically significant PCa and is reported according to the Prostate Imaging Reporting and Data System [25]. Here, we defined the default threshold for MRI positivity as a Prostate Imaging Reporting and Data System ≥ 3. An MRI-targeted biopsy (index test 3) was performed only for men with a positive MRI. We defined a positive MRI-targeted biopsy as a histopathological confirmation of the target condition. The thresholds for positivity for the different diagnostic tests in the model are shown in Table 1.

Table 1 Thresholds for positivity

2.5 Model Parameters

Literature searches using PubMed and The Cochrane Library databases were performed to select model inputs for population prevalence, the performance of diagnostic tests, and the probability of developing sepsis. Systematic reviews were preferred, and single studies were selected if the thresholds for positivity and target conditions were in accordance with definitions used in this study. Where no recent systematic reviews were available, the test accuracy was based on single studies, and probability estimates were synthesized based on relevant studies.

Costing input was estimated using Danish Diagnosis-Related Group tariffs. The Diagnosis-Related Group tariffs include all expenses accrued by the hospital, representing an average estimate of costs associated with hospital-based medical services and procedures. The costs of the initial tests were determined using a micro-costing approach.

2.5.1 Population Prevalence

Base-case analysis was performed as a cohort analysis based on summary statistics of men showing an increased risk of clinically significant PCa defined as an International Society of Urological Pathology grade group ≥ 2. The increased risk was defined as PSA values of 3–10 ng/mL and/or STHLM3 ≥ 10%. A recent Cochrane review investigating the performance of MRI and MRI-targeted biopsies in PCa detection applied a population prevalence of 30% [26]. We applied the same population prevalence of 30% in our model.

2.5.2 Test Performance

Studies investigating PSA performance using the positivity threshold of 3–10 ng/mL for detecting grade group ≥ 2 PCa in a population with an increased risk of PCa are lacking. Therefore, the relative sensitivity and specificity of PSA and STHLM3 were calculated based on the prospective population-based diagnostic study of STHLM3, where participants with PSA levels ≥ 3 ng/mL and/or STHLM3 ≥ 10% had a biopsy sample taken (n = 4947) [3]. In the study by Grönberg et al. [3], the STHLM3 test was calibrated to show the same sensitivity as PSA for the detection of clinically significant PCa. Thus, the same estimated sensitivity of 0.84 (95% confidence interval [CI] 0.81–0.87) was applied in the model for both PSA and STHLM3. The relative specificity was calculated to be 0.21 (95% CI 0.19–0.22) for PSA and 0.50 (95% CI 0.49–0.52) for STHLM3. The calculations can be found in the ESM.
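The way prevalence, sensitivity, and specificity combine into classification probabilities at the initial test can be sketched as follows. This is a minimal illustration, not the TreeAge model used in the study: it covers the first test only, uses the prevalence of 30% and the test accuracy values stated above, and the names `FirstTestOutcome` and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstTestOutcome:
    """Proportions of the cohort in each cell of the 2 x 2 table."""
    true_pos: float
    false_neg: float
    true_neg: float
    false_pos: float

    @property
    def prob_positive(self) -> float:
        # Share of the cohort sent on to urological examination and MRI.
        return self.true_pos + self.false_pos

    @property
    def correctly_classified(self) -> float:
        # Effectiveness measure: true positives plus true negatives.
        return self.true_pos + self.true_neg


def classify(prevalence: float, sensitivity: float, specificity: float) -> FirstTestOutcome:
    """Split a cohort by disease status and test result (hypothetical helper)."""
    healthy = 1.0 - prevalence
    return FirstTestOutcome(
        true_pos=prevalence * sensitivity,
        false_neg=prevalence * (1.0 - sensitivity),
        true_neg=healthy * specificity,
        false_pos=healthy * (1.0 - specificity),
    )


PREVALENCE = 0.30  # population prevalence applied in the model
psa = classify(PREVALENCE, sensitivity=0.84, specificity=0.21)
sthlm3 = classify(PREVALENCE, sensitivity=0.84, specificity=0.50)

for name, outcome in (("PSA", psa), ("STHLM3", sthlm3)):
    print(f"{name}: P(positive) = {outcome.prob_positive:.3f}, "
          f"correctly classified = {outcome.correctly_classified:.3f}")
```

Because the sensitivity is identical in both arms, the two outcomes differ only in the true-negative and false-positive cells.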

The sensitivity and specificity of MRI and MRI-targeted biopsy were obtained from the recent Cochrane Systematic Review by Drost et al. [26] and were assumed to be the same for both alternative PCa diagnostic pathways. The review found no statistically significant differences in the detection ratio between studies using bi-parametric or multi-parametric MRI and between studies using a software or a cognitive MRI-targeted biopsy technique [26]. Thus, in our study, MRI was assumed to include both types of pulse sequences and MRI-targeted biopsy included both types of biopsy techniques. The sensitivity applied in our model was 0.91 (95% CI 0.83–0.95) for MRI and 0.80 (95% CI 0.69–0.87) for MRI-targeted biopsy. The specificity applied was 0.37 (95% CI 0.29–0.46) for MRI and 0.94 (95% CI 0.90–0.97) for MRI-targeted biopsy [26].

2.5.3 Probability of Sepsis

The estimation of the probability of developing sepsis after a biopsy originated from a synthesis of estimates from the international literature. The synthesis was based on original studies that reported the hospitalization rate after a transrectal biopsy from a 2017 systematic review by Borghesi et al. [23]. In addition to the selected articles in the review, we performed a systematic literature search that included relevant articles published between October 2015 (the date of the last literature search in the review) and June 2021. Because of the reported increased risk of sepsis caused by increasing antibiotic resistance [23], the probability estimation was limited to include only studies published from 2011 to 2021. For the data synthesis, we included 36 original studies. Details of sepsis reporting varied between the 36 studies, and for the present analysis, we decided to include cases reported as sepsis, urinary tract infection/sepsis, or severe infection requiring hospitalization when determining the probability of sepsis after transrectal biopsy. Original data from the 36 included studies [27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62] were synthesized using Metaprop, a Stata command to perform a meta-analysis of binomial data [63] (ESM). The mean probability of sepsis after the transrectal biopsy was calculated at 0.02 (95% CI 0.02–0.03), which was applied in the model. The analyses were performed using STATA 17.
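To illustrate what a meta-analysis of binomial data involves, the sketch below pools study-level proportions with a DerSimonian-Laird random-effects model on the logit scale. It is an assumption-laden stand-in, not a reimplementation of Metaprop: the transformation, the confidence interval method, and the continuity correction chosen here may differ from those used in the study, and the event counts are hypothetical rather than taken from the 36 included studies.

```python
import math


def pool_proportions(events, totals):
    """Random-effects (DerSimonian-Laird) pooled proportion on the logit scale.

    Returns (estimate, lower 95% limit, upper 95% limit).
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        # Continuity correction for studies with zero events or all events.
        if e == 0 or e == n:
            e, n = e + 0.5, n + 1.0
        logits.append(math.log(e / (n - e)))
        variances.append(1.0 / e + 1.0 / (n - e))

    # Fixed-effect estimate, needed for the heterogeneity statistic Q.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    k = len(logits)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights and pooled logit.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return expit(pooled), expit(pooled - 1.96 * se), expit(pooled + 1.96 * se)


# HYPOTHETICAL sepsis counts and biopsy totals for four studies.
hypothetical_events = [4, 12, 7, 30]
hypothetical_totals = [250, 480, 310, 1500]
estimate, lower, upper = pool_proportions(hypothetical_events, hypothetical_totals)
print(f"pooled probability = {estimate:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```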

2.5.4 Costs

The cost of the PSA test was estimated to be €2 according to information from the Department of Clinical Biochemistry of Aarhus University Hospital, Denmark. The cost of the STHLM3 test, including test, analysis, local handling/administration, and shipping to Stockholm, was estimated to be €243 based on information from the Department of Molecular Medicine at Aarhus University Hospital. A table presenting the cost calculations of STHLM3 can be found in the ESM. The test costs did not include consultation and bloodwork performed by the general practitioner, as it was assumed that this procedure would be the same for both testing strategies.

The costs of the different follow-up tests and procedures, as well as the potential hospitalization and treatment due to sepsis, were obtained from the tariffs report published by the Danish Health Data Authority [64]. According to clinical guidelines, patients referred to MRI would also undergo a urodynamic examination in the hospital. Thus, the cost of this examination was added to the MRI pathways in the model. The costs applied in the model are shown in Table 2.

Table 2 Cost of procedures in the prostate cancer diagnostic pathways (2021 Euros)

2.6 Model Assumptions

During the conceptualization and construction of the decision tree, different model assumptions were made as a basis for the analysis. The assumptions can be found in the ESM.

2.7 Analyses

To evaluate the potential cost effectiveness of adopting the STHLM3 test as a primary diagnostic test, an incremental approach was used to identify additional costs and effects. Estimated values from the decision-analytic model were used to calculate the ICER by dividing incremental costs by incremental effects, representing the cost per unit of effect gained, measured as correctly classified individuals.
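The ICER calculation described above can be written as a short function. The sketch below is illustrative only: the function name `icer` is hypothetical, and the expected costs and effects in the example are placeholder values, not outputs of the decision tree.

```python
def icer(cost_new: float, effect_new: float, cost_old: float, effect_old: float) -> float:
    """Incremental cost per additional unit of effect (new strategy vs comparator)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the strategies are equally effective")
    return delta_cost / delta_effect


# HYPOTHETICAL expected values per tested man (cost in euros, effect as the
# proportion of correctly classified individuals).
example = icer(cost_new=650.0, effect_new=0.60, cost_old=550.0, effect_old=0.40)
print(f"ICER = {example:.1f} euros per additional correctly classified individual")
```

A positive ICER with a positive incremental effect corresponds to the northeast quadrant of the cost-effectiveness plane, where the new strategy is both more effective and more costly.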

2.8 Sensitivity Analyses

The model-based analysis was a simplification of real-world scenarios where input parameters were estimated based on the best available evidence, implying that the evidence may be subject to uncertainty. Both deterministic and probabilistic sensitivity analyses were conducted to test the robustness of the base-case analysis. Deterministic sensitivity analysis was applied to investigate parameter uncertainties, and the probabilistic sensitivity analyses investigated stochastic parameter uncertainties based on the distributions of the input parameters of the decision tree [21].

2.8.1 Deterministic Sensitivity Analysis

The deterministic sensitivity analysis was performed as a tornado analysis, identifying the variables that had the greatest influence on the ICER result produced by the model. In this analysis, one-way analyses of single variables were plotted in a single chart; the ICERs calculated from the given point estimates were compared to the ICER of the base-case analysis. As wider bars indicate greater variation in the outcome when the variable changes within the plausible range, the tornado diagram ranks the variables according to the degree of influence on the result [65]. Table 3 shows the input applied for the deterministic sensitivity analysis. Furthermore, we performed a threshold analysis of the cost of STHLM3 to investigate how much the cost of STHLM3 would need to be altered for the incremental cost per correctly classified individual to be €0.00 (ICER = 0).
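The ranking logic behind a tornado analysis can be sketched with a deliberately simplified model. The sketch makes strong assumptions that are not part of the study: only the first test differs between arms, every first-test positive incurs a single follow-up cost (`cost_followup`, a hypothetical placeholder), and the low/high ranges are invented for illustration rather than taken from Table 3.

```python
# Base-case inputs. Test costs, prevalence, and specificities come from the text;
# cost_followup is a HYPOTHETICAL stand-in for urological examination plus MRI.
BASE = {
    "cost_sthlm3": 243.0,
    "cost_psa": 2.0,
    "prevalence": 0.30,
    "spec_psa": 0.21,
    "spec_sthlm3": 0.50,
    "cost_followup": 600.0,
}

# HYPOTHETICAL low/high values for the one-way analyses.
RANGES = {
    "cost_sthlm3": (150.0, 300.0),
    "prevalence": (0.20, 0.40),
    "cost_followup": (450.0, 750.0),
}


def toy_icer(p):
    """ICER of STHLM3 vs PSA in the simplified model (equal sensitivity in both arms)."""
    # Extra true negatives with STHLM3 = first-test positives avoided.
    avoided_positives = (1.0 - p["prevalence"]) * (p["spec_sthlm3"] - p["spec_psa"])
    delta_cost = (p["cost_sthlm3"] - p["cost_psa"]) - avoided_positives * p["cost_followup"]
    return delta_cost / avoided_positives


def tornado(base, ranges, model):
    """Vary one input at a time and rank inputs by the width of the ICER range."""
    rows = []
    for name, (low, high) in ranges.items():
        icer_low = model({**base, name: low})
        icer_high = model({**base, name: high})
        rows.append((name, icer_low, icer_high, abs(icer_high - icer_low)))
    return sorted(rows, key=lambda row: row[3], reverse=True)


for name, icer_low, icer_high, width in tornado(BASE, RANGES, toy_icer):
    print(f"{name:14s} low={icer_low:8.1f} high={icer_high:8.1f} width={width:7.1f}")
```

Each row corresponds to one bar of a tornado diagram; sorting by width places the most influential input at the top.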

Table 3 Input for the deterministic sensitivity analysis

2.8.2 Probabilistic Sensitivity Analysis

By including distributions of plausible ranges for values used in the base-case analysis, the variables were randomly sampled simultaneously, each within its distribution, to calculate ICERs using second-order Monte-Carlo simulations. We applied the beta distribution for sensitivity, specificity, and probability, and the gamma distribution for costs. The standard deviation was calculated as the square root of the calculated variance. When running multiple ICER iterations (5000), the uncertainty was visualized as an incremental cost-effectiveness scatterplot, providing information on the robustness of the base-case analysis [65]. Table 4 shows the input applied for the probabilistic sensitivity analysis.
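One common way to obtain beta and gamma distributions for such an analysis is the method of moments, which converts a mean and standard deviation into distribution parameters. The sketch below shows that conversion and the sampling step; whether the study parameterized its distributions this way is an assumption, and both standard deviations used here are hypothetical (0.015 roughly matches the width of the reported 95% CI for sensitivity, and 10% of the mean is an arbitrary choice for the cost).

```python
import random
import statistics


def beta_params(mean: float, sd: float):
    """Method-of-moments alpha and beta for a beta distribution."""
    variance = sd ** 2
    if not 0.0 < mean < 1.0 or variance >= mean * (1.0 - mean):
        raise ValueError("mean/sd combination is not attainable by a beta distribution")
    common = mean * (1.0 - mean) / variance - 1.0
    return mean * common, (1.0 - mean) * common


def gamma_params(mean: float, sd: float):
    """Method-of-moments shape and scale for a gamma distribution."""
    if mean <= 0.0 or sd <= 0.0:
        raise ValueError("mean and sd must be positive")
    return (mean / sd) ** 2, sd ** 2 / mean


rng = random.Random(2021)  # fixed seed so the draws are reproducible
N_ITERATIONS = 5000

alpha, beta = beta_params(mean=0.84, sd=0.015)    # sensitivity; sd is HYPOTHETICAL
shape, scale = gamma_params(mean=243.0, sd=24.3)  # STHLM3 cost; sd is HYPOTHETICAL

sensitivity_draws = [rng.betavariate(alpha, beta) for _ in range(N_ITERATIONS)]
cost_draws = [rng.gammavariate(shape, scale) for _ in range(N_ITERATIONS)]

print(f"mean sampled sensitivity = {statistics.mean(sensitivity_draws):.3f}")
print(f"mean sampled STHLM3 cost = {statistics.mean(cost_draws):.1f}")
```

In a full probabilistic analysis, every uncertain input would be drawn in each iteration and passed through the decision tree to give one incremental cost and effect pair per iteration.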

Table 4 Input for the probabilistic sensitivity analysis

3 Results

The results from the model-based analysis showed that the STHLM3 test was more effective but also more expensive than PSA testing. The calculations presented in Table 5 showed an ICER of €511.7 [95% credible interval, 359.9–674.3], which corresponds to the cost of an additional correctly classified individual when using STHLM3 compared to PSA.

Table 5 Calculation of incremental cost effectiveness

3.1 Sensitivity Analyses

3.1.1 Deterministic Sensitivity Analysis

The deterministic sensitivity analysis is presented in Fig. 2. The analysis showed that variations in the cost of STHLM3 had the greatest influence on the ICER results, meaning that a potential decrease in the costs of the STHLM3 test would substantially lower the ICER. Variations in the population prevalence of PCa were also shown to have a large influence on the ICER, where a lower population probability of PCa would result in a lower ICER. Variations in the costs of the urological examination, MRI, or PSA testing were also shown to impact the ICER, where costlier procedures/tests would result in a decrease in the ICER. The cost of MRI-targeted biopsy and the frequency and cost of sepsis were shown to be of minor importance to the results.

Fig. 2 Tornado diagram of the deterministic sensitivity analysis. Bar colors: blue indicates lower incremental cost-effectiveness ratio (ICER); red indicates higher ICER. EV expected value, MRI magnetic resonance imaging, PCa prostate cancer, PSA prostate-specific antigen, STHLM3 Stockholm 3

The threshold analysis showed that at a STHLM3 cost of €138.70, the ICER was 0, meaning that at this threshold the STHLM3 test would become cost neutral compared with the PSA test. The threshold analysis can be found in the ESM.
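The reported threshold and base-case ICER can be related through simple arithmetic. The sketch below assumes that each man in the STHLM3 arm receives exactly one STHLM3 test, so that the expected cost of that arm changes one-for-one with the test price; this linearity is an assumption made here, not a statement from the study, and the derived quantities are implied values rather than reported results.

```python
COST_STHLM3_BASE = 243.00     # base-case STHLM3 cost (euros)
COST_STHLM3_NEUTRAL = 138.70  # reported cost-neutral STHLM3 cost (euros)
ICER_BASE = 511.7             # reported base-case ICER (euros)

# ASSUMPTION: arm cost is linear in the test price with slope 1, so the
# base-case incremental cost equals the distance to the cost-neutral price.
implied_incremental_cost = COST_STHLM3_BASE - COST_STHLM3_NEUTRAL
implied_incremental_effect = implied_incremental_cost / ICER_BASE


def icer_at_price(price: float) -> float:
    """Implied ICER at a given STHLM3 price under the linearity assumption."""
    return (price - COST_STHLM3_NEUTRAL) / implied_incremental_effect


print(f"implied incremental cost   = {implied_incremental_cost:.2f} euros")
print(f"implied incremental effect = {implied_incremental_effect:.3f}")
for price in (138.70, 200.00, 243.00):
    print(f"price {price:6.2f} -> implied ICER {icer_at_price(price):6.1f}")
```

Under this assumption the implied incremental effect is about 0.204, which is close to (1 − 0.30) × (0.50 − 0.21) = 0.203, the difference in true negatives at the initial test implied by the stated prevalence and specificities.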

3.1.2 Probabilistic Sensitivity Analysis

The probabilistic sensitivity analysis based on 5000 iterations, including stochastic variation in input parameters, showed that all simulations were positioned in the northeast quadrant of the incremental cost-effectiveness scatterplot, where the new technology was shown to be more effective and more costly than the comparator (Fig. 3). The cost-effectiveness acceptability curve, illustrating the probability of STHLM3 being cost effective compared with PSA at different threshold values, showed that STHLM3 had a 100% probability of being cost effective at a willingness to pay of €700 for an additional correctly classified individual (Fig. 4).
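A cost-effectiveness acceptability curve can be derived from the simulated pairs of incremental costs and effects by counting, for each willingness-to-pay value, the share of iterations with a positive net monetary benefit. The sketch below uses hypothetical normally distributed draws in place of the study's simulation output; the function name `ceac` and the distribution parameters are assumptions for illustration.

```python
import random


def ceac(delta_costs, delta_effects, thresholds):
    """Share of iterations with positive incremental net benefit per threshold."""
    n = len(delta_costs)
    curve = []
    for wtp in thresholds:
        favourable = sum(
            1 for dc, de in zip(delta_costs, delta_effects) if wtp * de - dc > 0
        )
        curve.append((wtp, favourable / n))
    return curve


# HYPOTHETICAL simulation output: 5000 incremental cost/effect pairs.
rng = random.Random(1)
sim_delta_costs = [rng.gauss(104.3, 12.0) for _ in range(5000)]
sim_delta_effects = [rng.gauss(0.204, 0.010) for _ in range(5000)]

curve = ceac(sim_delta_costs, sim_delta_effects, thresholds=range(0, 2001, 100))
for wtp, probability in curve:
    print(f"willingness to pay {wtp:5d} euros -> P(cost effective) = {probability:.2f}")
```

When every iteration lies in the northeast quadrant, the curve starts at 0 and rises monotonically to 1 as the willingness to pay increases.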

Fig. 3 Incremental cost effectiveness of Stockholm 3 (STHLM3) compared to prostate-specific antigen (PSA)

Fig. 4 Cost-effectiveness acceptability curve. PSA prostate-specific antigen, STHLM3 Stockholm 3

4 Discussion

The results of the base-case analysis showed that STHLM3 was more effective and more costly than PSA, with an ICER of €511.7 [95% credible interval, 359.9–674.3], which was the cost of one additional correctly classified individual. The deterministic sensitivity analysis showed that variations in the cost of STHLM3 had the most profound influence on the ICER results and that a potential decrease in the costs of the STHLM3 test would substantially lower the ICER, possibly making STHLM3 cost effective. The probabilistic sensitivity analysis, based on 5000 iterations including the stochastic variation in the input parameters, showed that STHLM3 had a 100% probability of being cost effective at a willingness to pay of €700 for an additional correctly classified individual.

While the present study focused on the PCa diagnostic pathway in a contemporary clinical setting characterized by opportunistic PSA testing, a 2018 systematic review of decision-analytical models investigated the cost effectiveness of PCa screening [67]. Four of the ten studies included in the review identified strategies that might be cost effective; however, these studies found that the results were sensitive to the specific quality-of-life values used. Therefore, the review concluded that despite several model-based evaluations, the cost effectiveness of screening was unclear and that robust evidence to inform cost effectiveness was lacking. The review recommended further research based on clinically verified models, which should be supplemented by country-specific data along with prospective quality-of-life data [67]. Similarly, a recent study based on Dutch population data investigated the harms, benefits, and cost effectiveness of 230 different PSA-based screening strategies using a micro-simulation analysis (MISCAN) model [68]. This study concluded that the optimal strategy would be screening with 3-year intervals at ages 55–64 years, resulting in an ICER of €19,733 per QALY. According to the authors’ analyses, screening before age 55 years and screening after age 64 years were not preferred strategies [68].

Another recent Swedish study aimed to assess the long-term health effects and cost effectiveness of five screening interventions: no screening, PSA screening, and STHLM3 screening at three different reflex thresholds, using a micro-simulation model [18]. Compared to no screening, all screening strategies had ICERs corresponding to a moderate to high cost per QALY gained, which indicated a high cost to society from PCa screening. Supporting the results of our study, the study by Karlsson et al. [18] concluded that the cost effectiveness of the STHLM3-based screening was sensitive to the cost of the STHLM3 test, and they argued that a decrease in the cost of the STHLM3 test could be expected with greater use.

The present analysis investigated STHLM3 compared to PSA used as a primary diagnostic test applied in an opportunistic testing strategy. This meant that the estimated disease prevalence of the modeled population was high, compared with the general male population. The mean prevalence of PCa among men has been shown to increase with older age from 5% at age < 30 years to 59% by age > 79 years (67). However, the probability of high-risk potentially lethal PCa may be expected to be very different from the probability of any (possibly indolent) PCa. A Swedish register-based population study investigated the probability of different PCa severity and how the probability was influenced by family history and age (68). The study found a mean population probability of non-low-risk PCa (Gleason score ≥ 7 and/or T3–4 and/or PSA ≥ 10 ng/mL and/or N1 and/or M1) at the age of 65 years of 2.8% (95% CI 2.7–2.8) for the general Swedish population (68). Autopsy studies have previously investigated the prevalence of PCa among cases with no history of urological disease and reported prevalences of Gleason 7 or greater cancers of 8% [69] and 8.5% [70]. Thompson et al. investigated the prevalence of PCa among men with a PSA level ≤ 4.0 ng/mL and found an overall prevalence of 2.3% of high-grade cancers (Gleason score ≥ 7); however, the prevalence of high-grade cancers increased with increasing PSA levels [8].

The sensitivity and specificity of clinical tests are independent of the population tested, but the predictive value of the test is highly dependent on the prevalence of the disease in the population of interest (65, 66). Low population prevalence would lead to a significant increase in numbers of false positives and substantial additional costs when compared with testing a population at risk. Therefore, because of the modeled cohort in this analysis, the results are not applicable if considering implementation of STHLM3 testing as a potential screening strategy and thus, are not directly comparable to the cost-effectiveness results from population-based screening studies. Further limiting the generalizability of the results, this analysis aimed to compare the cost effectiveness of STHLM3 and PSA in the diagnostic gray zone area of lower PSA values, thus only considering their application in men presenting with PSA values of 3–10 ng/mL.
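The dependence of predictive value on prevalence can be made concrete with a short calculation. The sketch below applies the STHLM3 sensitivity (0.84) and specificity (0.50) used in the model at two prevalences: the 30% applied in the model and, purely for illustration, the 2.8% population probability of non-low-risk PCa cited above. Carrying the same test accuracy over to a low-prevalence population is an assumption made for this example only.

```python
def positive_predictive_value(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Probability of disease given a positive test (Bayes' theorem)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def false_positives_per_1000(prevalence: float, specificity: float) -> float:
    """Expected number of false-positive results per 1000 men tested."""
    return 1000.0 * (1.0 - prevalence) * (1.0 - specificity)


SENSITIVITY, SPECIFICITY = 0.84, 0.50  # STHLM3 values applied in the model

for label, prevalence in (("modeled cohort", 0.30), ("illustrative low prevalence", 0.028)):
    ppv = positive_predictive_value(prevalence, SENSITIVITY, SPECIFICITY)
    fp = false_positives_per_1000(prevalence, SPECIFICITY)
    print(f"{label}: PPV = {ppv:.3f}, false positives per 1000 tested = {fp:.0f}")
```

With unchanged sensitivity and specificity, lowering the prevalence reduces the positive predictive value sharply while the number of false positives per 1000 men tested rises.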

Currently, no population-based screening strategy is recommended for PCa and, in the absence of organized PSA tests, opportunistic testing has become common practice in several European Union member states, as well as in Denmark [71]. However, evidence suggests that this approach is associated with overdiagnosis. The European Association of Urology has developed a new risk adapted early PCa detection strategy, which is based on PSA testing, risk calculators, and MRI [71]. It has been advocated that the risk-adapted strategy be endorsed by the European Commission in its 2022 plan and that individual countries be requested to incorporate the risk-adapted strategy into national cancer plans [71].

4.1 Strengths and Limitations

A strength of this study was that the analysis was based on test accuracy data from a systematic review of the PCa accuracy literature, which represents the best available data. However, there are some limitations in the analysis that must be considered in the interpretation of the results.

Determining a potential PCa diagnostic strategy should take multiple additional factors into account, which were not assessed in our study where we only included the diagnostic pathway. To fully assess the costs associated with a diagnostic strategy, a lifetime horizon must be adopted, considering, for example, active surveillance strategies, repeated diagnostic activities, treatment costs, and complications associated with these activities. However, this was beyond the scope of this study and hence may be considered as a limitation. Furthermore, the applied measure of effectiveness was correctly classified individuals, which was directly related to the test performance. This measure of effectiveness complicates the ability to evaluate whether STHLM3 is cost effective, as there is no explicit willingness-to-pay threshold. An evaluation of effectiveness in terms of QALYs, rather than a clinical measure, would allow the comparison of incremental cost effectiveness across different disease areas and therefore consideration of the opportunity costs and added value to society [21]. However, the aim of this study was to evaluate the cost effectiveness of the PCa diagnostic process using STHLM3 as an alternative to PSA and because of a lack of data, it was not possible to include QALYs as an effectiveness outcome for this specific period.

To estimate correctly classified individuals, it was important to identify the sensitivity and specificity of the different tests and procedures. As this study had an a priori determined target condition, as well as positivity thresholds for the tests/procedures, and a defined population, the amount of evidence meeting these specific criteria was rather limited. For MRI and MRI-targeted biopsy, we included a Cochrane systematic review, which is expected to represent evidence of high quality. For PSA, we were unable to identify any studies that met the criteria, and thus the relative sensitivity and specificity of PSA and STHLM3 were calculated based on the prospective population-based diagnostic study of the STHLM3 [3]. In this study, the STHLM3 test was calibrated to obtain the same sensitivity as the PSA test; hence the calculated relative specificity of the PSA/STHLM3 test was the only factor varying between arms and thus the effectiveness results were driven by the difference in true-negative cases.

The original diagnostic study of the STHLM3 test [3] excluded biopsy results from men with PSA values > 10 ng/mL. Thus, available data did not support inclusion of men with PSA values > 10 ng/mL in the present study, which is a potential limitation when considering clinical use of STHLM3 as an alternative to PSA rather than as a reflex test. The exclusion of men with PSA levels > 10 ng/mL might lead to the cost of STHLM3 being underestimated in the present study. The STHLM3 test is not currently implemented in routine clinical practice. However, the STHLM3 test incorporates the analysis of the PSA level as its first component. Thus, it is possible to perform the analysis of the PSA level as a first partial analysis, and then only perform the full STHLM3 test as a reflex test. Therefore, if adopted into future clinical practice, determination of the PSA cut-off for performing the full STHLM3 analysis is highly relevant and could significantly impact the costs. Different PSA cut-offs have been investigated and currently a lower cut-off of 1.5 ng/mL is considered [72]. Accordingly, men with PSA < 1.5 ng/mL would not have the full STHLM3 test and would simply be scored as STHLM3 negative as they have a very low risk of significant PCa upon biopsy [72]. Performing a partial STHLM3 analysis (PSA analysis only) in a subset of patients (e.g., those with PSA < 1.5 ng/mL or > 10 ng/mL) would be expected to lower the cost of STHLM3. However, the true costs of adopting the STHLM3 test as an alternative to the PSA test in clinical practice would probably be higher than the results presented in this study, owing to the low cost of the current PSA test.

In this study, a STHLM3 threshold value for positivity of ≥ 10% was applied. While beyond the scope of the present study, we suggest that future studies assess the influence of different STHLM3 cut-offs on the cost-effectiveness results. Conceivably, higher STHLM3 cut-off values would result in fewer false-positive STHLM3 findings and thereby reduce the costs of the STHLM3 arm. Findings from a recent clinical study by Nordström et al. (2021) support this argument: compared with a PSA of 3 ng/mL or higher, a STHLM3 of 15% or higher provided identical sensitivity for detecting clinically significant cancer and led to fewer MRI procedures and fewer biopsy procedures [15]. However, we also note that the STHLM3 algorithm and underlying assay have been revised since the initiation of this study [73]; thus, cut-off values for the revised test are not directly comparable to those applied in this study.

The lack of suitable data is one of the major limitations of the present analysis. It also means that any correlations that may exist between the sensitivity and specificity data for the range of diagnostic tests have been ignored. However, although the model-based analysis was a simplified representation of a complex clinical field, the data and results presented in this study may be informative for decision making at an early stage.

Sepsis is a well-known potential complication after transrectal biopsy, and this specific complication was therefore included in the model. The probability of sepsis was estimated using a data synthesis of 36 original studies. Because of differences in reporting, we included cases reported as sepsis, urinary tract infection/sepsis, or severe infection requiring hospitalization when determining the probability of sepsis. This decision might lead to an overestimation of the risk, and thereby of the costs. However, the included probability of sepsis of 0.02 (95% CI 0.02–0.03) was relatively low compared with the most recent Danish hospitalization statistics after transrectal biopsies, which showed a rate of 0.06 in 2020 at the national level [74]; this reduces the likelihood that the probability was overestimated. Furthermore, the deterministic sensitivity analysis found that the probability of sepsis was of minor importance for the overall results.

The study used a hospital perspective with a macro-costing approach, applying Danish national tariffs for the different diagnostic procedures. A clear limitation of this approach is its lack of sensitivity to differences in the specific activities, and a micro-costing analysis might have added more detailed information. However, the use of national tariffs is transparent and makes the results more generalizable across settings, which we considered to be of greater relevance.

Additional research is needed to identify the most clinically relevant and cost-effective PCa diagnostic strategy. To reduce overdiagnosis, harm, and costs, future studies may address different protocols, for example, the newly developed risk-adapted early PCa detection strategy from the European Association of Urology. In addition to cost-effectiveness analyses, studies assessing the budget impact of different screening strategies could provide useful information on the financial consequences of adopting each strategy, which is highly relevant in a decision-making context.

5 Conclusions

Compared with using the PSA test as an initial test in PCa diagnostics, the STHLM3 test was more effective, but at additional cost. The results were sensitive to the cost of the STHLM3 test; therefore, a lower cost of the STHLM3 test would improve its cost effectiveness compared with PSA.