Abstract
Purpose
We aimed to assess the appropriateness of ChatGPT in providing answers related to prostate cancer (PCa) screening, comparing GPT-3.5 and GPT-4.
Methods
A committee of five reviewers designed 30 questions related to PCa screening, categorized into three difficulty levels. The questions were submitted identically to both GPTs three times, with a different prompt each time. Each reviewer assigned a score for accuracy, clarity, and conciseness. Readability was assessed with the Flesch-Kincaid Grade (FKG) and Flesch Reading Ease (FRE). Mean scores were compared using the Wilcoxon test, and readability across the three prompts was compared by ANOVA.
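For reference, FKG and FRE are computed from standard published formulas based on sentence length and syllables per word. The sketch below uses a naive vowel-group heuristic for syllable counting (the study's exact readability tooling is not specified, so this is an illustrative assumption, not the authors' implementation):

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count contiguous vowel groups; every word gets at least 1.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade) for a text."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / sentences
    syllables_per_word = syllables / len(words)
    # Standard Flesch formulas:
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    fkg = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fkg
```

Higher FRE means easier text (the GPT-4 mean of 47.6 vs. GPT-3.5's 37.3 indicates easier reading), while FKG approximates the US school grade needed to understand the text.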
Results
In GPT-3.5, the mean score (SD) for accuracy, clarity, and conciseness was 1.5 (0.59), 1.7 (0.45), and 1.7 (0.49), respectively, for easy questions; 1.3 (0.67), 1.6 (0.69), and 1.3 (0.65) for medium; and 1.3 (0.62), 1.6 (0.56), and 1.4 (0.56) for hard. In GPT-4, it was 2.0 (0), 2.0 (0), and 2.0 (0.14) for easy questions; 1.7 (0.66), 1.8 (0.61), and 1.7 (0.64) for medium; and 2.0 (0.24), 1.8 (0.37), and 1.9 (0.27) for hard. GPT-4 outperformed GPT-3.5 on all three qualities at every difficulty level. The mean FKG was 12.8 (1.75) for GPT-3.5 answers and 10.8 (1.72) for GPT-4; the mean FRE was 37.3 (9.65) for GPT-3.5 and 47.6 (9.88) for GPT-4. The second prompt achieved better results in terms of clarity (all p < 0.05).
Conclusions
GPT-4 displayed superior accuracy, clarity, conciseness, and readability compared with GPT-3.5. Although prompts influenced response quality in both GPTs, their impact was statistically significant only for clarity.
Data availability
The data supporting the findings of this study are available upon specific request to the author.
Funding
No funding was received to assist with the preparation of this manuscript.
Author information
Contributions
Conceptualization: Chiarelli Giuseppe, Abdollah Firas. Data curation: Chiarelli Giuseppe, Arora Sohrab. Formal analysis: Stephens Alex. Funding acquisition: Rogers Craig, Abdollah Firas. Investigation: Chiarelli Giuseppe, Cirulli Giuseppe Ottone, Finati Marco, Beatrici Edoardo, Dejan Filipas, Tinsley Shane, Arora Sohrab. Methodology: Chiarelli Giuseppe, Stephens Alex, Abdollah Firas. Project administration: Abdollah Firas. Supervision: Bhandari Mahendra, Trinh Quoc-Dien, Carrieri Giuseppe, Briganti Alberto, Montorsi Francesco, Lughezzani Giovanni, Buffi Nicolò. Validation: Abdollah Firas. Visualization: Chiarelli Giuseppe. Writing–original draft: Chiarelli Giuseppe. Writing–review and editing: Chiarelli Giuseppe, Abdollah Firas.
Ethics declarations
Conflict of interest
The authors have no financial or proprietary interests in any material discussed in this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chiarelli, G., Stephens, A., Finati, M. et al. Adequacy of prostate cancer prevention and screening recommendations provided by an artificial intelligence-powered large language model. Int Urol Nephrol (2024). https://doi.org/10.1007/s11255-024-04009-5