Abstract
Background
ChatGPT is a free artificial intelligence (AI) language model developed and released by OpenAI in late 2022. This study aimed to evaluate how accurately ChatGPT answers clinical questions (CQs) from the Guideline for the Management of Blepharoptosis published by the American Society of Plastic Surgeons (ASPS) in 2022.
Methods
The CQs in the guideline were submitted to ChatGPT as questions in both English and Japanese. For each question, we assessed ChatGPT's answer to the CQ, the evidence quality and recommendation strength it stated, whether the references it cited matched, and the word count of the answer. We compared ChatGPT's performance on each component between English and Japanese queries.
Results
A total of 11 questions were included in the final analysis, and ChatGPT answered 61.3% of these correctly. Compared with Japanese queries, English queries yielded a higher accuracy rate for answers to CQs (76.4% versus 46.4%; p = 0.004) and longer answers (123 words versus 35.9 words; p = 0.004). No statistically significant differences were noted for evidence quality, recommendation strength, or reference match. A total of 697 references were proposed, but only 216 of them (31.0%) existed.
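The reference figures above can be checked with simple arithmetic. The following is a minimal sketch, not part of the study's methods: the helper name `percent` and the variable names are hypothetical, and only the two counts (697 and 216) come from the Results.

```python
def percent(part: int, whole: int, digits: int = 1) -> float:
    """Return part/whole as a percentage rounded to `digits` decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, digits)


# Counts taken from the Results; variable names are illustrative only.
proposed_references = 697  # references proposed by ChatGPT
existing_references = 216  # proposed references that actually existed

# Share of proposed references that existed, and the complementary share.
existing_share = percent(existing_references, proposed_references)
nonexistent_share = percent(
    proposed_references - existing_references, proposed_references
)

print(f"Existing references: {existing_share}%")
print(f"Non-existent references: {nonexistent_share}%")
```

Under these assumptions, the complement shows that 481 of the 697 proposed references, roughly two thirds, could not be matched to an existing publication.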
Conclusions
ChatGPT demonstrates potential as an adjunctive tool in the management of blepharoptosis. However, it is crucial to recognize that the existing AI model has distinct limitations, and its primary role should be to complement the expertise of medical professionals.
Level of Evidence V
Observational study under respected authorities. This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266.
Acknowledgments
ChatGPT was not used for, and did not assist with, any part of the conception, design, execution, writing, or editing of the present study.
Funding
No financial support was obtained for this study.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest to disclose.
Ethical Approval
The study focused on the adaptation of publicly available clinical guidelines and did not involve human subjects or patient data. No ethical approval was required for this study.
Human and Animal Rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed Consent
For this type of study, informed consent is not required.
About this article
Cite this article
Shiraishi, M., Tomioka, Y., Miyakuni, A. et al. Performance of ChatGPT in Answering Clinical Questions on the Practical Guideline of Blepharoptosis. Aesth Plast Surg (2024). https://doi.org/10.1007/s00266-024-04005-1
DOI: https://doi.org/10.1007/s00266-024-04005-1