Digital decision support for structural improvement of melanoma tumor boards: using standard cases to optimize workflow

Purpose Choosing optimal cancer treatment is challenging, and certified cancer centers must present all patients in multidisciplinary tumor boards (MDT). Our aim was to develop a decision support system (DSS) to provide treatment recommendations for apparently simple cases already at conference registration and to classify these as “standard cases”. According to certification requirements, discussion of standard cases is optional and would thus allow more time for complex cases. Methods We created a smartphone query that simulated a tumor conference registration and requested all information needed to provide a recommendation. In total, 111 out of 705 malignant melanoma cases discussed at a skin cancer center from 2017 to 2020 were identified as potential standard cases, for which a digital twin recommendation was then generated by the DSS. Results The system provided reliable advice in all 111 cases and showed 97% concordance of MDT and DSS for therapeutic recommendations, regardless of tumor stage. Discrepancies included two cases (2%) where the DSS advised discussion at MDT and one case (1%) with a deviating recommendation due to advanced patient age. Conclusions Our work aimed not to replace clinical expertise but to alleviate MDT workload and enhance focus on complex cases. Overall, our DSS proved to be a suitable tool for identifying standard cases as such, providing correct treatment recommendations, and thus reducing the time burden of tumor conferences in favor of the comprehensive discussion of complex cases. The aim is to implement the DSS in routine tumor board software for further qualitative assessment of its impact on oncological care.


Introduction
Every day, dermato-oncologists face the challenge of ensuring evidence-based best clinical practice treatment for their cancer patients. Continuously updated treatment recommendations lead to an almost unmanageable number of treatment options (Iqvia 2022). In malignant melanoma, we are faced with a rapidly changing therapeutic landscape in which immunotherapeutics and targeted treatment options are the gold standard in palliative and curative treatment (DKG 2020). Other emerging fields are neoadjuvant treatment approaches and adapted surgical procedures at earlier cancer stages (Amaria 2022; Luke 2022).
However, there are often deviations from the treatment considered best clinical practice, and it must be assumed that a significant proportion of cancer patients do not receive optimal care (Bierbaum 2020; Bierbaum 2023; Heins 2017). An evaluation of the National Cancer Database (USA) on melanoma treatment showed, for example, that approximately 20% of patients with T2/3 melanoma were not treated according to the guidelines (Narang 2021).
In order to improve the quality of oncological care in Germany, the certification of cancer centers was implemented and has since become increasingly mandatory. Current data show that treatment in certified cancer centers leads to higher treatment quality and better survival.
David Hoier and Philipp Koll have contributed equally to this work.
A central feature for certification as a cancer center by the German Cancer Society (DKG) is the presentation of all treated melanoma patients from stage IIC in a multidisciplinary tumor board (MDT) (Kowalski 2017). At the skin cancer center Cologne, all melanoma cases from stage IIB have to be presented to the MDT.
However, the MDT presentation of all tumor patients due to certification requirements led to a significant quantitative increase in case discussions. Studies have shown that this does not automatically translate to an increase in quality (Soukup 2016; Walraven 2019).
A large number of cases to be discussed poses a challenge to the participants in these boards (Soukup 2016; Walraven 2019). Data evaluating the extent to which MDT improves outcomes remain mixed: some studies have found a survival benefit, while others have found no such advantage. A 2019 analysis indicated that the 5-year survival rate was 15.6% higher among cases in well-organized MDT but almost 20% lower in disorganized MDT compared with no MDT (Keating 2013; Kesson 2012; Lu 2019; Stone 2020; Wong 2022).
In the end, the quality of MDTs depends on the expertise and motivation of the participants and the time available to discuss the individual cases (Jalil 2018).
Decision support systems (DSS) could provide useful support here and appear to offer remarkable potential (Chen 2016; Letzen 2019; Soukup 2019). Surprisingly, however, artificial intelligence (AI)-based DSS have so far failed to gain acceptance in the field of clinical oncology (Bungartz 2018). This is even the case for standard oncological questions regarding first-line therapy (Schmidt 2017).
Expert-curated DSS algorithms, developed on a knowledge base provided by oncological professionals, might be a superior approach for adequate decision support. As described in Nature Biotechnology in 2018, these expert-curated DSS provide multiple advantages when compared with AI-based systems (Bungartz 2018). Most importantly, expert-curated systems seem to represent clinical reality better than the AI-based systems used so far.
The aim of the present study was to evaluate the potential benefit of a DSS to optimize the workflow of tumor conferences.

Materials and methods
We first created a query that simulated a melanoma tumor conference registration and requested all the information needed to provide a treatment recommendation. This algorithm was then implemented in the established oncology smartphone application "EasyOncology" (EO). This certified medical product is aimed at specialized personnel and provides the treatment concepts for the most common tumor entities. Thus, the basis was provided to easily match the smartphone DSS recommendations with the real decisions of the MDT and to determine the quality of the concordance.
To evaluate the reliability of each DSS recommendation, we followed the same approach that was used to assess the accuracy of AI-based DSS and benchmarked the digital treatment recommendations against real-world MDT decisions (Choi 2019; Kim 2019; Lee 2018; Somashekhar 2018; Yu 2021; Zhou 2019). For this purpose, we determined the concordance rates of diagnostic and therapeutic recommendations for newly diagnosed cases proposed by our DSS and a certified MDT for patients with cutaneous malignant melanoma.

Smartphone application
The smartphone application EasyOncology (EO) was developed by clinically experienced oncologists and is intended to provide evidence-based diagnostic and therapeutic recommendations for common solid cancer entities. EO's oncological treatment algorithm "therapy finder" is based on a decision tree, which was developed through a systematic process with clinical experts in oncology across diverse institutions. More precisely, EO's platform is based on current oncological guidelines (e.g., S3 guidelines (DKG 2020) and NCCN guidelines (Swetter 2021)), drug approval status, current publications of relevant studies, and best clinical practice from leading German cancer centers.
Frequent testing and challenging of the algorithm with real-world test cases enables identification of practice-changing medical standards with subsequent corresponding adjustment of the query. Finally, frequent version updates ensure that the latest advancements in the field of dermato-oncology are reflected.
EO was ranked top three in a worldwide comparison of 157 oncological applications in 2017 and was certified as a medical device in 2020 (Calero 2017). The software is a CE-marked medical device and subject to the corresponding regulations to ensure its security and reliability. The software relies on anonymous input without exchange of identifiable patient information via hospital intranet.
In the present work, EO's therapy finder (version 5.06) was used to generate first-line diagnostic and therapeutic recommendations for patients with all stages of newly diagnosed cutaneous malignant melanoma. This software version displayed the 8th edition of the AJCC classification for melanoma. A graphic illustration of the app interface of EasyOncology's therapy finder is depicted in Fig. 1.
The DSS query algorithm of EO's therapy finder requests clinicopathologic data to generate treatment recommendations in a stepwise fashion.
The variables requested by the DSS included Breslow thickness; the presence of an ulceration (pT-stage); the histopathologic results of the sentinel lymph node biopsy (SLNB); radiologic staging information; the histopathologic results of the complete lymph node dissection (CLND); the postresection residual cancer status (R0, R1, and R2); the clinical and pathological evaluation of in-transit and satellite metastases; and the presence of important driver gene mutations (i.e., BRAF, NRAS, and cKIT).
The number of input variables necessary to generate a treatment recommendation depended on the complexity of each case. For simple cases (i.e., early-stage localized malignant melanoma), merely two variables were needed for DSS output, whereas more complex cases required up to five clinicopathologic variables. A simplified graphic illustration of how EO's therapy finder-based DSS generates treatment recommendations is depicted in Fig. 2.
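The stepwise query described above can be illustrated with a small sketch. This is not EO's actual decision tree: the variable names, answer categories, and recommendation labels are hypothetical placeholders chosen only to show how a simple case can terminate after two answers while a more complex case needs more, and how any answer the tree does not cover falls back to a discussion in the MDT.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    """Terminal node: a recommendation label or a referral to the MDT."""
    outcome: str


@dataclass
class Question:
    """Inner node: requests one variable and branches on the answer."""
    variable: str
    branches: Dict[str, "Node"]


Node = Union[Leaf, Question]

REFER = Leaf("discuss in MDT")

# Hypothetical toy tree; all labels are placeholders, not clinical advice.
TREE = Question("ulceration", {
    "no": Question("breslow_category", {
        "low": Leaf("placeholder recommendation A"),
        "high": Question("slnb_result", {
            "negative": Leaf("placeholder recommendation B"),
            "positive": Question("staging_imaging", {
                "no distant metastasis": Leaf("placeholder recommendation C"),
            }),
        }),
    }),
})


def run_query(node: Node, answers: Dict[str, str]) -> Tuple[str, List[str]]:
    """Walk the tree and return the outcome plus the variables requested."""
    asked: List[str] = []
    while isinstance(node, Question):
        asked.append(node.variable)
        if node.variable not in answers:
            # No recommendation can be issued without the requested input.
            return f"missing input: {node.variable}", asked
        # Answers the tree does not cover are referred to the MDT.
        node = node.branches.get(answers[node.variable], REFER)
    return node.outcome, asked


if __name__ == "__main__":
    print(run_query(TREE, {"ulceration": "no", "breslow_category": "low"}))
```

In this sketch the fallback to `REFER` is a design choice of the illustration: a case that leaves the predefined paths is never answered digitally.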

Definition of standard cases and study inclusion criteria
According to certification criteria, all treatment cases from stage IIC upward must be presented in MDT, and the recommendations given are to be documented. Of note, those cases whose treatment concepts can be decided without extensive discussion can already be defined as "standard cases" when registering for the conference and thus flagged in the conference protocol. In this context, clear guideline cases, e.g., early stages that do not require extensive interdisciplinary discussion, are considered as standard cases. Since a discussion of these standard cases is not mandatory, they do not consume any time of the actual conference.
In clinical routine, these possible standard cases are often not recognized at the time of registration, for example because the registering clinician does not have sufficient clinical experience. Thus, an increase in the proportion of standard cases that were pre-answered by a DSS could relieve the tumor conference accordingly. For example, in a case of primary disease, EO's query algorithm first requests the pT-stage and then the Breslow thickness, which leads to the treatment recommendation for stage IA melanoma (Fig. 1). In accordance with the specifications for automatic definition as a standard case, only cases without complicating factors were included in the analysis (Fig. 3). This procedure also corresponds to our approach that complicated cases should of course be discussed in the tumor board and by no means just decided digitally.
Cases with non-cutaneous melanoma (i.e., mucosal melanoma, uveal melanoma, or melanoma of unknown origin), as well as patients with relapsed disease, with secondary cancers, severe comorbidities, patients treated in clinical trials, or patients who explicitly declined diagnostic procedures (i.e., SLNB) were excluded from evaluation.
By intention, cases with brain metastases were also excluded from analysis, as stereotactic or neurosurgical procedures should not be declared as standard cases. Lastly, we excluded those cases that lacked relevant clinicopathologic data needed as input variables by EO's therapy finder.
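The exclusion criteria listed above can be expressed as a simple eligibility check. The following is a minimal sketch under stated assumptions: the `Case` record and its field names are hypothetical and do not reflect the study's actual data model; only the list of exclusion reasons is taken from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Case:
    """Hypothetical case record; field names are illustrative assumptions."""
    cutaneous_melanoma: bool
    relapsed: bool = False
    secondary_cancer: bool = False
    severe_comorbidity: bool = False
    in_clinical_trial: bool = False
    declined_diagnostics: bool = False
    brain_metastases: bool = False
    inputs_complete: bool = True


def exclusion_reasons(case: Case) -> List[str]:
    """Return every exclusion criterion from the text that applies."""
    checks = [
        (not case.cutaneous_melanoma, "non-cutaneous melanoma"),
        (case.relapsed, "relapsed disease"),
        (case.secondary_cancer, "secondary cancer"),
        (case.severe_comorbidity, "severe comorbidity"),
        (case.in_clinical_trial, "treated in clinical trial"),
        (case.declined_diagnostics, "declined diagnostic procedure"),
        (case.brain_metastases, "brain metastases"),
        (not case.inputs_complete, "missing clinicopathologic data"),
    ]
    return [reason for applies, reason in checks if applies]


def is_potential_standard_case(case: Case) -> bool:
    """A case qualifies only if no exclusion criterion applies."""
    return not exclusion_reasons(case)


if __name__ == "__main__":
    print(exclusion_reasons(Case(cutaneous_melanoma=True, brain_metastases=True)))
```

Returning the list of reasons rather than a bare flag makes it possible to tabulate why cases were excluded, as in the study flowchart.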

Patient selection and study design
Ethical approval to conduct this work was granted by the Ethics Committee of the Medical Faculty of the University of Cologne (#20-1116).
The retrospective MDT dataset evaluation initially included 2399 cases with malignant diseases of the skin who received treatment at the Department of Dermatology and Venereology, University of Cologne, between January 2017 and December 2020.
As depicted in the study inclusion flowchart (Fig. 3), we first excluded all non-melanoma cases. In total, 705 melanoma cases remained that were screened for eligibility, of which 594 patient cases presented to the MDT were not suitable for analysis with EO.
Finally, MDT treatment recommendations of 111 cases that fulfilled our pre-selection criteria for standard cases were included for comparison. Thereafter, the clinical information that remained was used to generate a DSS treatment recommendation.
Treatment recommendations for each case given by MDT or DSS were compiled as response pairs. Subsequently, after blinding of decision origin, each response pair was assessed for obvious discrepancies in recommendations and accordingly classified as "concordant" or "incorrect recommendations." In a second independent review, an experienced dermato-oncologist analyzed each non-concordant decision pair for the quality of the decisions and sub-grouped them into three categories ("concordant recommendation," "correct alternative recommendation," or "incorrect recommendation"), similar to previous publications evaluating the DSS Watson for Oncology by IBM (Choi 2019). When comparative cases from the past are used, the DSS provides an updated treatment recommendation, which leads to a time-dependent deviation from the historic MDT decision. Therefore, we assumed historic MDT treatment decisions to be correct and decided to classify this time-dependent deviation in the same group as those more recent cases with a correct alternative recommendation according to best clinical standard.
As an example, for the group of "correct alternative recommendation" cases, our DSS recommended adjuvant therapy based on the mutational status, whereas, prior to 2018, MDT issued a recommendation for adjuvant interferon.
Finally, "incorrect recommendations" cases were analyzed in detail to identify potential algorithm query errors. The evaluation process is depicted in Fig. 4.
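The text does not specify how the origin of each decision was blinded. One possible way to do it, shown here purely as an illustrative assumption, is to randomize the order of the two recommendations within each response pair and to keep the unblinding key separate from the material handed to the reviewer. The function name and tuple layout are hypothetical.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str, str]  # (case_id, MDT recommendation, DSS recommendation)


def blind_pairs(pairs: List[Pair], seed: int = 0):
    """Shuffle the order within each pair; return blinded pairs and the key."""
    rng = random.Random(seed)  # fixed seed so the key can be reproduced
    blinded, key = [], []
    for case_id, mdt_text, dss_text in pairs:
        items = [("MDT", mdt_text), ("DSS", dss_text)]
        rng.shuffle(items)
        # The reviewer sees only the two texts, in randomized order.
        blinded.append((case_id, items[0][1], items[1][1]))
        # The key records which position came from which source.
        key.append((case_id, items[0][0], items[1][0]))
    return blinded, key


if __name__ == "__main__":
    demo = [("case-1", "recommendation text X", "recommendation text Y")]
    blinded, key = blind_pairs(demo)
    print(blinded)
    print(key)
```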

Data analysis and statistics
Descriptive statistics and data analysis were carried out using IBM's statistics software SPSS version 28 and Microsoft Excel. Descriptive statistics were depicted as numbers, percentages, or median. In line with previous publications, the concordance rate was presented as a percentage agreement between DSS and MDT, i.e., concordance rate (%) = (number of cases with agreement between DSS and MDT / total number of cases) × 100.

Descriptive statistics
Clinicopathologic characteristics of 111 malignant melanoma patients fulfilling our predefined standard case inclusion criteria for the determination of concordance are depicted in Table 1.

Concordance rates
In decisions regarding the optimal first-line treatment for patients with malignant melanoma, the overall concordance rate between recommendations proposed by our DSS and those given by MDT was 97%. This includes 87 (78%) "concordant" cases and 21 (19%) "correct alternative recommendation" cases (Fig. 5a). Treatment concordance rates according to malignant melanoma stages, i.e., I, II, III, and IV, were 100%, 95%, 98%, and 100%, respectively (Fig. 5b). Quality of concordance was independent of age, melanoma stage, histologic subtype, gene mutation status, and complete lymph node dissection.
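The reported percentages can be reproduced from the case counts given in the text. The short sketch below only restates that arithmetic; the variable names are illustrative, and the count of 3 "incorrect recommendation" cases is taken from the section on non-concordant cases.

```python
# Case counts as reported in the text (111 standard cases in total).
counts = {
    "concordant": 87,
    "correct alternative recommendation": 21,
    "incorrect recommendation": 3,
}

total = sum(counts.values())


def pct(n: int, denominator: int) -> float:
    """Percentage agreement, as used for the concordance rate."""
    return 100.0 * n / denominator


# Overall concordance counts both concordant and correct alternative cases.
overall = pct(
    counts["concordant"] + counts["correct alternative recommendation"], total
)

if __name__ == "__main__":
    print(f"total cases: {total}")
    for label, n in counts.items():
        print(f"{label}: {n} ({pct(n, total):.1f}%)")
    print(f"overall concordance: {overall:.1f}%")
```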

Non-concordant cases
As requested by protocol, the 3 "incorrect" cases were analyzed to identify potential systematic errors caused by our DSS decision algorithm. This independent review process was performed by an experienced dermato-oncologist. Two of these non-concordant cases with high-risk melanoma (pT4b) showed either suspicious cervical lymph nodes or a solitary pulmonary nodule in CT imaging after primary resection. Because these findings remained unclear from a diagnostic perspective, the DSS advised that both cases need to be discussed in the MDT to find an individualized solution and no therapeutic recommendation was provided.
The third "incorrect" case was an 80-year-old patient who presented with cutaneous satellite metastases. In accordance with the S3 guidelines for malignant melanoma (DKG 2020), the DSS recommended a complete surgical resection of satellite metastases and adjuvant therapy. In contrast, MDT recommended an oncolytic viral immunotherapy with talimogene laherparepvec (T-VEC). According to the reviewing dermato-oncologist, this therapy was correctly recommended by MDT due to the advanced age of the patient and the difficult resection site of the melanoma. Even though our DSS provided the correct stage-specific therapy recommendation, its recommendation was not checked for its applicability with regard to patient-specific factors.
Fig. 4 Flowchart for evaluating decision concordance of recommendations given by MDT or DSS. A first evaluation compared DSS and MDT treatment recommendations for concordance. Discordant recommendations were blinded to their origin and analyzed in detail by a dermato-oncologist, who categorized each recommendation either as "concordant recommendation," "correct alternative recommendation" or as "incorrect recommendation"

Discussion
Treatment in certified tumor centers undeniably improves the quality of oncological care, and MDTs are one of the most important key quality features. However, it is relatively unclear to what extent the tumor boards have contributed to the improvement in survival rates in the certified centers (Devitt 2013; Keating 2013; Krasna 2013; Soukup 2019; Specchia 2020).
The certification requirement to present the majority of tumor cases in MDTs has led to a noticeable increase in the number of cases to be discussed in the conferences, which could have an unfavorable effect on the quality of recommendations (Soukup 2016; Walraven 2019).
Data evaluating the extent to which MDT improves outcomes remain mixed: some studies have found a survival benefit, while others have found no such advantage. A 2019 analysis indicated that the 5-year survival rate was 15.6% higher among cases in well-organized MDT but almost 20% lower in disorganized MDT compared with no MDT (Keating 2013; Kesson 2012; Lu 2019; Stone 2020; Wong 2022).
It seems almost surprising that AI-based systems still have not become established to support decision making of MDT in routine clinical oncology. However, previous attempts to provide standardized treatment recommendations for the first-line treatment of tumor diseases using AI-based applications have shown too much uncertainty compared to the expertise of experienced oncologists. In several studies, for example, the use of Watson for Oncology could only achieve agreement rates of 12% to 93% in direct comparison with real MDT decisions (Zhou 2019; Choi 2019; Kim 2019; Lee 2018; Somashekhar 2018; Seidman 2015).
Fig. 5 Treatment concordance between DSS and MDT. a Overall treatment concordance rates between the therapeutic recommendation given by the MDT and the treatment recommendation given by DSS for malignant melanoma. Overall concordance was 97%. b Treatment concordance rates according to malignant melanoma stages I, II, III, and IV were 100%, 95%, 98%, and 100%, respectively
These previous AI approaches have been tested in many different tumor entities, but experience in decision support for melanoma therapy has not yet been published.
One important reason for the poor performance of AI systems is the lack of high-quality training datasets. They are available en masse as standardized data for AI systems used in image recognition, but not as regular cases in oncology care (McKinney et al. 2020; Ardila et al. 2019; Rubin 2019; Esteva et al. 2017).
In addition to the lack of well-organized and verified training data, another problem is the limited resource of experts who are initially required for the human interpretation and evaluation of the AI results.
An optimized workflow of the tumor conference is another crucial criterion for effectiveness. As a requirement for an optimally structured conference, all information that is necessary for a therapy decision should first be available. In addition, those cases that require greater concentration due to their complexity should ideally be the focus of the conference. Those medically simple cases for which clear recommendations can easily be derived from the guidelines should ideally be noted in the protocol as standard cases and only discussed optionally.
Here we see digital decision support systems as a suitable tool for structural improvement. By issuing guideline-compliant therapy recommendations at the time of conference registration, standard cases could be defined, and the conference could be unburdened accordingly.
In this respect, we see malignant melanoma as a suitable model disease for the development of decision support for tumor boards, as the melanoma treatment algorithm is considered comparatively complex.
Our dataset included 2399 cases with malignant diseases of the skin, of which 705 melanoma cases were discussed in the tumor board of the Skin Cancer Center of the University Hospital Cologne in the period from 2017 to 2020. This fits in well with the DKG annual reports of the certified skin cancer centers, which, for example, show a melanoma proportion of 22.2% of all presented skin cancers for the year 2022 (DKG 2022).
The fact that only 111 of the 705 cases with malignant melanoma could be included in the evaluation of our work is mainly due to the high proportion of recurrent diseases (377), non-cutaneous melanomas (78), and very complex disease patterns (43) that obviously cannot be declared as standard cases (see flowchart of patient case exclusion process, Fig. 3). In addition, no therapy recommendation could be automatically derived in 72 cases due to insufficient information relevant for decision making.
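The case numbers quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only counts stated in the text; the remainder of excluded cases not covered by the four itemized reasons is a derived value, not a figure reported in the paper, and is presumably accounted for by the other exclusion criteria shown in the flowchart.

```python
screened = 705   # melanoma cases screened for eligibility
included = 111   # cases included as potential standard cases
excluded = screened - included

# Exclusion reasons itemized in the text.
itemized = {
    "recurrent disease": 377,
    "non-cutaneous melanoma": 78,
    "very complex disease pattern": 43,
    "insufficient information": 72,
}

itemized_total = sum(itemized.values())
# Derived, not reported: excluded cases without an itemized reason here.
not_itemized = excluded - itemized_total

share_included = 100.0 * included / screened
share_missing_info = 100.0 * itemized["insufficient information"] / screened

if __name__ == "__main__":
    print(f"excluded: {excluded}")
    print(f"itemized exclusions: {itemized_total}, not itemized: {not_itemized}")
    print(f"included: {share_included:.1f}% of screened cases")
    print(f"insufficient information: {share_missing_info:.1f}% of screened cases")
```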
As a result, we found three cases with divergent standard recommendations. Of these, two cases were rated as non-concordant in accordance with the protocol, as no automated recommendation was made at all, but rather a presentation to the tumor board was correctly recommended.
An important finding was the identification of a case in which the recommendation deviated due to the patient's advanced age. This implies that the query algorithm should be adapted to assess age in these clinical constellations so that systemic treatment can be critically discussed with the patient, especially in adjuvant treatment.
The present work has several limitations that should be considered.
As the first limitation, we only selected first-line cases and thus missed a large number of discussions on relapsed cases with first metastatic disease. In addition, certification requirements only request the presentation of melanoma cases in stages IIC and higher, leading to a low number of stage I melanoma cases.
As we included cases that were presented to MDT for the first time during the clinical course, there were few cases with initial metastatic disease. Thus, concordance rates for stage I and stage IV were calculated based on a small number of patient cases (3 and 4 patient cases, respectively). For these stages, no general conclusions can be drawn and further validation is necessary.
The second limitation derives from our strict inclusion criteria, which were intended to preselect cases that were most likely standard cases. As a consequence, only 16% of all cases were selected for detailed comparison, most of them stage II and III. If the high proportion of up to 50% stage II and III melanoma cases is taken into account, which reflects the reality of MDT in the annual DKG reports, the potential to unburden MDT by DSS becomes more evident. Our evaluation also indicates 10% of cases (n = 72) for which the DSS was unable to issue a recommendation due to missing information. If treatment-relevant information were obtained more systematically by a DSS at the time of conference registration, a further reduction in the burden could be achieved.
As a third limitation, S3 guidelines for malignant melanoma were updated four times (DKG 2020) and numerous new therapeutics received approval during the study period between 2017 and 2020. These changes in the guidelines led to deviations in the recommendations, for example on the role of interferon therapy. Of note, the presented data are based on software (EO version 5.06) that displayed the 8th edition of the AJCC classification for melanoma and remained unchanged during the study period. Changes from the 7th to the 8th edition in 2018 did not affect cases included in this study.
The development of the DSS in collaboration with the skin cancer center of the University Hospital Cologne could indicate a performance bias and the risk of overfitting. In further studies, validation has to be performed using a multicentric approach. However, it is important to emphasize that the decision logic is based on the S3 guideline for malignant melanoma (DKG 2020) and the approval status of new therapeutics that are used in best clinical practice. The continuous comparison of real-world and digital decisions ensures that new treatment concepts are quickly detected and thus enables EO's expert curators to integrate the latest medical standard and to appropriately adjust the query algorithm.
As a statistical limitation, the research methodology used to determine concordance rates is only descriptive in nature and describes the degree of agreement between DSS and MDT. At this time, no conclusion about the clinical benefit of using a DSS, such as overall survival or progression-free survival of the patients, can be drawn from this approach.
The advantages we show with this work could not only provide substantial support in everyday clinical practice, but also provide a basis for the later integration of AI-based systems.
As a first advantage, the proposed principle allows an automatic structuring of the tumor conferences according to complexity and enables quality assurance of the recommendations given by automatic comparison with the guidelines of the medical societies.
Second, the system ensures that all information necessary for a therapeutic decision is already requested at the time of registration for the conference. Otherwise, missing information often leads to postponement or only very vague recommendations such as "indication for systemic therapy." The repetitive questioning of decision-relevant information that occurs with repeated use leads to the third advantage, the teaching effect. This could also be reinforced by the sense of satisfaction that comes from having received a (digital) treatment recommendation.
Not least, another advantage is that the type of recommendation matching described above generates training data for the future integration of AI systems. As said before, a machine learning tool can only ever be as good as the data available for training and the trainers who evaluate the AI results.

Summary
It must be underlined again that the aim of our work is not to provide digital recommendations for all questions addressed to multidisciplinary tumor boards or to replace clinical experience. However, we see the need to relieve the time burden of these critical conferences so that the participants can focus their expertise on the more complex tumor cases. Our results suggest that this automated approach would allow a more concentrated and detailed discussion of complex tumor cases, on which the valuable expertise of the board members should be focused. In addition, the implementation of the presented algorithm in the routine software of tumor boards could provide the basis for transparent and comparable quality management.

Perspective
The principle of digital decision support described in this paper, whose algorithmic query is based on a decision tree, may seem unspectacular at first glance against the background of current AI development. However, it is currently the most suitable way to make recommendations for clinical situations that are already defined by clear therapeutic standards and guidelines. These commonly accepted therapeutic recommendations are based, among other things, on the approval status and availability of the therapeutic agents, as well as some clinical and structural aspects which are not evidence-based per se. It should therefore come as no surprise that the recommendations made by AI-based systems for well-defined standard clinical situations often do not correspond to standard practice. At this point, it should be considered that the AI recommendations could well be a better therapeutic choice, though this cannot be proven due to the AI black box effect.
Accordingly, it is to be expected that AI applications will initially realize their full potential in complex clinical constellations of advanced cancer diseases in which clinical standards do not exist.
However, this requires the availability of sufficient training data generated under the requirements of a defined healthcare system in routine care. These are precisely the data that are currently lacking; training data from other countries and healthcare systems cannot be used without hesitation.
The intended integration of query algorithms into the organizational software of tumor boards offers an important cornerstone here. The data collected on standard situations in oncological care can serve as the necessary training data for the planned integration of suitable AI models.
However, there is still a long way to go before AI delivers such impressive results in decision making as we are currently seeing with image-supported AI systems.

Fig. 1
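Fig. 1 App interface of EasyOncology's therapy finder: Stepwise diagnostic query for malignant melanoma treatment recommendation. Relevant information is requested by EO's query algorithm to enable a treatment recommendation. In this example of primary disease, the pT-stage is requested first. Subsequently, the algorithm query requires the Breslow thickness, which, in this example, leads to the treatment recommendation for stage IA melanoma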

Fig. 2
Fig. 2 Query algorithm of EasyOncology's therapy finder-based DSS.Depending on the selected initial diagnosis relevant diagnostic steps are requested by EO's query algorithm until a treatment recommendation is given.Abbreviations: a: including satellite and in-

Fig. 3
Fig. 3 Flowchart of patient case exclusion process

Table 1
NA not available (not relevant or "pending" for EO's query); R0 no residual cancer; R1 macroscopic residual cancer removed, while margins remain positive for microscopic residual cancer. Bold type numbers indicate statistical significance. ** denotes that the p-value is significant at the 1% level, and * that the p-value is significant at the 5% level