
Adequacy of prostate cancer prevention and screening recommendations provided by an artificial intelligence-powered large language model

  • Urology - Original Paper
  • Published in: International Urology and Nephrology

Abstract

Purpose

We aimed to assess the appropriateness of ChatGPT-generated answers on prostate cancer (PCa) screening, comparing GPT-3.5 and GPT-4.

Methods

A committee of five reviewers designed 30 questions related to PCa screening, categorized into three difficulty levels. Each question was submitted to both GPT-3.5 and GPT-4 three times, each time with a different prompt. Each reviewer assigned a score for accuracy, clarity, and conciseness. Readability was assessed with the Flesch-Kincaid Grade (FKG) and the Flesch Reading Ease (FRE). Mean scores were extracted and compared using the Wilcoxon test, and readability across the three prompts was compared by ANOVA.
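For context, the two readability indices used here are standard functions of average sentence length and average syllables per word. The sketch below is a minimal Python illustration of those formulas; the vowel-group syllable counter and the sample answer text are simplifying assumptions for demonstration, not the tool or data used in the study.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: one syllable per contiguous vowel group (illustration only).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_scores(text: str) -> tuple[float, float]:
    # Count sentences and words with simple regexes.
    n_sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syllables = sum(count_syllables(w) for w in words)

    asl = n_words / n_sentences      # average sentence length (words per sentence)
    asw = n_syllables / n_words      # average syllables per word

    fre = 206.835 - 1.015 * asl - 84.6 * asw    # Flesch Reading Ease
    fkg = 0.39 * asl + 11.8 * asw - 15.59       # Flesch-Kincaid Grade
    return fkg, fre

# Hypothetical chatbot answer, used only to exercise the formulas.
answer = ("Prostate cancer screening usually starts with a PSA blood test. "
          "Discuss the benefits and risks with your doctor before deciding.")
fkg, fre = flesch_scores(answer)
print(f"FKG = {fkg:.1f}, FRE = {fre:.1f}")
```

Lower FKG and higher FRE values indicate text that is easier to read.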

Results

For GPT-3.5, the mean (SD) scores for accuracy, clarity, and conciseness were 1.5 (0.59), 1.7 (0.45), and 1.7 (0.49) for easy questions; 1.3 (0.67), 1.6 (0.69), and 1.3 (0.65) for medium; and 1.3 (0.62), 1.6 (0.56), and 1.4 (0.56) for hard. For GPT-4, they were 2.0 (0), 2.0 (0), and 2.0 (0.14) for easy questions; 1.7 (0.66), 1.8 (0.61), and 1.7 (0.64) for medium; and 2.0 (0.24), 1.8 (0.37), and 1.9 (0.27) for hard. GPT-4 outperformed GPT-3.5 on all three qualities and at all difficulty levels. The mean FKG was 12.8 (1.75) for GPT-3.5 and 10.8 (1.72) for GPT-4; the mean FRE was 37.3 (9.65) and 47.6 (9.88), respectively. The second prompt achieved better results in terms of clarity (all p < 0.05).
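As a rough illustration of the comparisons reported above, the sketch below runs a paired Wilcoxon signed-rank test (GPT-3.5 vs GPT-4 accuracy) and a one-way ANOVA of readability across the three prompts. All input arrays are hypothetical placeholders generated for demonstration; they are not the study data.

```python
import numpy as np
from scipy.stats import wilcoxon, f_oneway

rng = np.random.default_rng(0)

# Placeholder per-question accuracy scores on the 0-2 scale, averaged over reviewers,
# for 30 questions; hypothetical values only, not the scores collected in the study.
gpt35_accuracy = rng.normal(1.4, 0.3, 30)
gpt4_accuracy = rng.normal(1.9, 0.1, 30)

# Paired, non-parametric comparison of the two models on the same questions.
w_stat, p_model = wilcoxon(gpt35_accuracy, gpt4_accuracy)
print(f"Wilcoxon GPT-3.5 vs GPT-4 accuracy: W = {w_stat:.1f}, p = {p_model:.4f}")

# One-way ANOVA of a readability score (e.g., FRE) across the three prompt versions.
fre_prompt1 = rng.normal(45, 8, 30)
fre_prompt2 = rng.normal(50, 8, 30)
fre_prompt3 = rng.normal(46, 8, 30)
f_stat, p_prompt = f_oneway(fre_prompt1, fre_prompt2, fre_prompt3)
print(f"ANOVA across prompts: F = {f_stat:.2f}, p = {p_prompt:.4f}")
```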

Conclusions

GPT-4 displayed superior accuracy, clarity, conciseness, and readability compared with GPT-3.5. Although the prompts influenced response quality in both models, their impact was statistically significant only for clarity.

Data availability

The data supporting the findings of this study are available upon specific request to the author.

Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Authors and Affiliations

Authors

Contributions

Conceptualization: Chiarelli Giuseppe, Abdollah Firas. Data curation: Chiarelli Giuseppe, Arora Sohrab. Formal analysis: Stephens Alex. Funding acquisition: Rogers Craig, Abdollah Firas. Investigation: Chiarelli Giuseppe, Cirulli Giuseppe Ottone, Finati Marco, Beatrici Edoardo, Filipas Dejan, Tinsley Shane, Arora Sohrab. Methodology: Chiarelli Giuseppe, Stephens Alex, Abdollah Firas. Project administration: Abdollah Firas. Supervision: Bhandari Mahendra, Trinh Quoc-Dien, Carrieri Giuseppe, Briganti Alberto, Montorsi Francesco, Lughezzani Giovanni, Buffi Nicolò. Validation: Abdollah Firas. Visualization: Chiarelli Giuseppe. Writing–original draft: Chiarelli Giuseppe. Writing–review and editing: Chiarelli Giuseppe, Abdollah Firas.

Corresponding author

Correspondence to Firas Abdollah.

Ethics declarations

Conflict of interest

The authors have no financial or proprietary interests in any material discussed in this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chiarelli, G., Stephens, A., Finati, M. et al. Adequacy of prostate cancer prevention and screening recommendations provided by an artificial intelligence-powered large language model. Int Urol Nephrol (2024). https://doi.org/10.1007/s11255-024-04009-5

