Abstract
Purpose
Confocal laser endomicroscopy (CLE) is an imaging tool that has demonstrated potential for intraoperative, real-time, non-invasive, microscopic assessment of surgical margins in oropharyngeal squamous cell carcinoma (OPSCC). However, interpreting CLE images remains challenging. This study investigates the application of OpenAI’s Generative Pretrained Transformer (GPT) 4.0 with vision capabilities to the automated classification of CLE images in OPSCC.
Methods
CLE images of histologically confirmed SCC or healthy mucosa were retrieved from a database of 12 809 CLE images from 5 patients with OPSCC and anonymized. Using a training data set of 16 images, a validation set of 139 images, comprising SCC (83 images, 59.7%) and healthy mucosa (56 images, 40.3%), was classified through the application programming interface (API) of GPT 4.0. The same set of images was also classified by three CLE experts (two surgeons and one pathologist) who were blinded to the histology. Diagnostic metrics, the reliability of GPT across repeated runs, and inter-rater reliability were assessed.
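As an illustration of the diagnostic metrics mentioned above, the following minimal sketch derives accuracy, sensitivity, and specificity from a 2×2 confusion table. Only the class sizes (83 SCC, 56 healthy) come from the text; the split into correct and incorrect calls is invented for the example and is not the study's result. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # SCC images classified as SCC
    fn: int  # SCC images classified as healthy
    tn: int  # healthy images classified as healthy
    fp: int  # healthy images classified as SCC


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard binary diagnostic metrics, with SCC as the positive class."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "prevalence": (c.tp + c.fn) / total,
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# ASSUMPTION: the error split is made up for illustration. Only the row
# totals (83 SCC, 56 healthy) are taken from the Methods text.
counts = ConfusionCounts(tp=70, fn=13, tn=40, fp=16)
metrics = diagnostic_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, the prevalence evaluates to 83/139, which matches the 59.7% share of SCC images stated above; all other values depend on the invented error split.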
Results
The overall accuracy of the GPT model was 71.2%. Intra-rater agreement was κ = 0.837, indicating almost perfect agreement across the three runs of GPT-generated results. Human experts achieved an accuracy of 88.5% with a substantial level of agreement (κ = 0.773).
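The agreement figures above can be illustrated with a short sketch. The abstract does not say which κ statistic was computed; the code below assumes Fleiss' kappa, a common choice when three runs or three raters label the same items. The verbal labels follow the Landis and Koch scale, which is where the terms "substantial" and "almost perfect" come from. The toy ratings are invented, and the function names are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings: list[list[str]]) -> float:
    """Fleiss' kappa for items that each received the same number of ratings."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    categories = sorted({label for item in ratings for label in item})

    # Observed agreement: mean proportion of agreeing rater pairs per item.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    shares = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_expected = sum(s ** 2 for s in shares)

    if p_expected == 1.0:  # every rating falls in one category
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


def landis_koch_label(kappa: float) -> str:
    """Verbal interpretation of kappa on the Landis and Koch scale."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# ASSUMPTION: toy data, three runs over four images; not the study's ratings.
toy_runs = [
    ["SCC", "SCC", "SCC"],
    ["healthy", "healthy", "healthy"],
    ["SCC", "SCC", "healthy"],
    ["healthy", "healthy", "SCC"],
]
toy_kappa = fleiss_kappa(toy_runs)
print(f"toy kappa = {toy_kappa:.3f} ({landis_koch_label(toy_kappa)})")
print(landis_koch_label(0.837), "|", landis_koch_label(0.773))
```

Applied to the reported values, `landis_koch_label` maps κ = 0.837 to "almost perfect" and κ = 0.773 to "substantial", consistent with the wording in the Results.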
Conclusions
Although limited to a specific clinical framework, patient cohort, and image set, this study sheds light on some previously unexplored diagnostic capabilities of large language models using few-shot prompting. It suggests that the model can extrapolate information and classify CLE images from minimal example data. Whether future versions of the model can achieve clinically relevant diagnostic accuracy, especially in uncurated data sets, remains to be investigated.
Data availability
Data are available upon reasonable request from the corresponding author.
Funding
This project was supported by the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft), Grant Number 3182/2-1, Project Number 439264659.
Ethics declarations
Conflict of interest
None of the authors has any personal conflict of interest to declare.
Ethical approval
All procedures performed in this study involving human participants complied with the ethical standards of the institutional and/or national research committee (approval number 60_14 B) and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.
Informed consent
No participant consent for publication is necessary.
About this article
Cite this article
Sievert, M., Aubreville, M., Mueller, S.K. et al. Diagnosis of malignancy in oropharyngeal confocal laser endomicroscopy using GPT 4.0 with vision. Eur Arch Otorhinolaryngol 281, 2115–2122 (2024). https://doi.org/10.1007/s00405-024-08476-5
DOI: https://doi.org/10.1007/s00405-024-08476-5