Abstract
Introduction
Large language models (LLMs) have revolutionized the way humans interact with artificial intelligence (AI) technology, with marked potential for applications in esthetic surgery. The present study evaluates the performance of Bard, a novel LLM, in identifying and managing postoperative patient concerns for complications following body contouring surgery.
Methods
The American Society of Plastic Surgeons’ website was queried to identify and simulate all potential postoperative complications following body contouring across a range of acuity and severity. Bard’s accuracy was assessed in providing a differential diagnosis, soliciting a history, suggesting a most-likely diagnosis, recommending an appropriate disposition, suggesting treatments/interventions to begin from home, and identifying red-flag signs/symptoms indicating deterioration or requiring urgent emergency department (ED) presentation.
Results
Twenty-two simulated body contouring complications were examined. Overall, Bard demonstrated a 59% accuracy in listing relevant diagnoses on its differentials, with a 52% incidence of incorrect or misleading diagnoses. Following history-taking, Bard demonstrated an overall accuracy of 44% in identifying the most-likely diagnosis, and a 55% accuracy in suggesting the indicated medical dispositions. Helpful treatments/interventions to begin from home were suggested with a 40% accuracy, whereas red-flag signs/symptoms, indicating deterioration, were shared with a 48% accuracy. A detailed analysis of performance, stratified according to latency of postoperative presentation (<48 hours, 48 hours–1 month, or >1 month postoperatively), and according to acuity and indicated medical disposition, is presented herein.
Conclusions
Despite the promising potential of LLMs and AI in healthcare-related applications, Bard’s performance in the present study falls significantly short of accepted clinical standards, indicating a need for further research and development prior to adoption.
Level of Evidence IV
This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors at www.springer.com/00266.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest to disclose.
Human and Animal rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed Consent
For this type of study, informed consent is not required.
About this article
Cite this article
Abi-Rafeh, J., Mroueh, V.J., Bassiri-Tehrani, B. et al. Complications Following Body Contouring: Performance Validation of Bard, a Novel AI Large Language Model, in Triaging and Managing Postoperative Patient Concerns. Aesth Plast Surg 48, 953–976 (2024). https://doi.org/10.1007/s00266-023-03819-9