Key summary points
Aim
To study doctors' degree of agreement with an artificial intelligence tool (ChatGPT) that provided answers to different problems or situations in geriatric medicine.
Findings
Specialists rated ChatGPT's answers lower than residents did. Answers to questions related to general or theoretical aspects obtained higher mean scores, while those related to complex clinical decisions obtained lower scores.
Message
ChatGPT could be a good tool for generating hypotheses and for ordering and articulating ideas, but it is still far from being usable for medical decision-making in our context.
Abstract
Purpose
The purposes of the study were to describe the degree of agreement of geriatricians with the answers given by an AI tool (ChatGPT) in response to questions related to different areas of geriatrics, to study the differences between specialists and residents in geriatrics in terms of their degree of agreement with ChatGPT, and to analyse the mean scores obtained by area of knowledge/domain.
Methods
An observational study was conducted involving 126 doctors from 41 geriatric medicine departments in Spain. Ten questions about geriatric medicine were posed to ChatGPT, and doctors evaluated the AI's answers using a Likert scale. Sociodemographic variables were included. Questions were categorized into five knowledge domains, and means and standard deviations were calculated for each.
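The per-domain summary described above can be sketched as follows. This is a minimal illustration only, not the authors' analysis code: the question-to-domain mapping, the domain labels used as keys, and all ratings are made up for the example, and pooling every rating within a domain before computing the mean and sample standard deviation is an assumption about how the figures were aggregated.

```python
from statistics import mean, stdev

# Hypothetical mapping of question number -> knowledge domain (illustrative only).
DOMAIN_OF_QUESTION = {
    1: "general/theoretical",
    2: "general/theoretical",
    3: "diagnosis/complementary tests",
    4: "complex decisions/end-of-life",
}

# Made-up Likert ratings (1-5): one dict per respondent, keyed by question number.
ratings = [
    {1: 4, 2: 5, 3: 2, 4: 3},
    {1: 4, 2: 3, 3: 3, 4: 2},
    {1: 5, 2: 4, 3: 2, 4: 2},
]


def domain_summary(ratings, domain_of_question):
    """Return {domain: (mean, sample SD)} over all ratings pooled per domain."""
    pooled = {}
    for respondent in ratings:
        for question, score in respondent.items():
            pooled.setdefault(domain_of_question[question], []).append(score)
    # stdev() needs at least two ratings per domain.
    return {domain: (mean(scores), stdev(scores)) for domain, scores in pooled.items()}


summary = domain_summary(ratings, DOMAIN_OF_QUESTION)
for domain, (m, sd) in summary.items():
    print(f"{domain}: M = {m:.2f}, SD = {sd:.2f}")
```

Averaging each respondent's ratings within a domain first, and then summarising those per-respondent means, would be an equally plausible reading and would generally give a smaller standard deviation.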
Results
A total of 130 doctors answered the questionnaire, of whom 126 (69.8% women, mean age 41.4 [9.8]) were included in the final analysis. The mean score obtained by ChatGPT was 3.1/5 [0.67]. Specialists rated ChatGPT lower than residents (3.0/5 vs. 3.3/5 points, respectively, P < 0.05). By domain, ChatGPT scored better on general/theoretical questions (M: 3.96; SD: 0.71) than on complex decisions/end-of-life situations (M: 2.50; SD: 0.76), and answers related to diagnosis/performance of complementary tests obtained the lowest scores (M: 2.48; SD: 0.77).
Conclusion
Scores varied widely depending on the area of knowledge. Questions related to theoretical aspects of the challenges/future of geriatrics obtained better scores. For complex decision-making, the appropriateness of therapeutic efforts, and decisions about diagnostic tests, professionals rated ChatGPT's performance as poorer. AI is likely to be incorporated into some areas of medicine, but it still presents important limitations, mainly in complex medical decision-making.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
We would like to express our gratitude to Elisabet Sánchez-García, MD, PhD, and Ester Jovell, MD, PhD, for their valuable time and help. We are also grateful to Mrs. Ares Gratal for her help in the language correction process.
Funding
No direct or indirect financial support by extramural sources was received.
Author information
Contributions
DR-J conceived the study and carried out the statistical analysis, data interpretation and project management. DR-J, SD and YC wrote the first draft of the manuscript. DR-J, SD and FR wrote and edited the final draft of the manuscript and bibliography. DR-J, SD, YC, LC-L, FR and ML-M were involved in the collection of data and manuscript revision. LC-L, FR and ML-M supervised the final paper. All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Ethics declarations
Conflict of interest
None.
Ethical approval
The Hospital Universitari de Terrassa Clinical Research Ethical Committee concluded that no ethical issues were found. Code number: 02-23-175-115.
Informed consent
For this type of study, consent is not required.
About this article
Cite this article
Rosselló-Jiménez, D., Docampo, S., Collado, Y. et al. Geriatrics and artificial intelligence in Spain (Ger-IA project): talking to ChatGPT, a nationwide survey. Eur Geriatr Med (2024). https://doi.org/10.1007/s41999-024-00970-7
DOI: https://doi.org/10.1007/s41999-024-00970-7