
Evaluating the performance of ChatGPT in answering questions related to urolithiasis

Urology - Original Paper
International Urology and Nephrology

Abstract

Purpose

ChatGPT is an artificial intelligence (AI) program based on natural language processing. We assessed ChatGPT's knowledge of urolithiasis to determine whether it can be used to inform patients about the condition.

Methods

Frequently asked questions (FAQs) about urolithiasis on the websites of urological associations and hospitals were analyzed. In addition, strong-recommendation-level statements were gathered from the urolithiasis section of the European Association of Urology (EAU) 2022 guidelines. All questions were posed, in order, to the 3 August version of ChatGPT. Each answer was evaluated independently by two specialist urologists and scored from 1 to 4 (1: completely correct; 2: correct but inadequate; 3: a mix of correct and misleading information; 4: completely incorrect).
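
The authors used the public ChatGPT web interface, not the API. As a minimal illustrative sketch only, a comparable batch run, including asking each question twice to check answer consistency, could be scripted against the OpenAI API; the model name, file names, and helper function below are assumptions rather than the authors' setup, and the 1-4 grading by the two urologists would still be applied by hand to the saved answers.

    # Hypothetical sketch of the question-collection step; the study itself
    # used the ChatGPT web interface, so model and file names are assumptions.
    import csv

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(question: str) -> str:
        """Send one question to the model and return its answer text."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # assumed stand-in for the "3 August version"
            messages=[{"role": "user", "content": question}],
        )
        return response.choices[0].message.content

    with open("urolithiasis_questions.txt") as f:
        questions = [line.strip() for line in f if line.strip()]

    with open("answers_for_grading.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["question", "answer_1", "answer_2", "identical"])
        for q in questions:
            first, second = ask(q), ask(q)  # each question asked twice
            writer.writerow([q, first, second, first == second])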

Results

Of the FAQs, 94.6% were answered completely correctly, and no question was answered completely incorrectly. All questions about general information, diagnosis, and ureteral stones were graded 1. Of the 60 questions prepared from the EAU guideline recommendations, 50 (83.3%) were graded 1, 8 (13.3%) were graded 2, and 2 (3.3%) were graded 3. All questions on general information, diagnosis, renal calculi, ureteral calculi, and metabolic evaluation received the same answer when asked a second time.
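
The guideline-question percentages follow directly from dividing each grade count by the 60 questions; a brief sketch of that arithmetic, using the counts reported above:

    # Grade distribution for the 60 EAU-guideline questions (counts from Results).
    grade_counts = {1: 50, 2: 8, 3: 2, 4: 0}
    total = sum(grade_counts.values())  # 60

    for grade, count in sorted(grade_counts.items()):
        print(f"grade {grade}: {count}/{total} = {100 * count / total:.1f}%")
    # grade 1: 50/60 = 83.3%, grade 2: 8/60 = 13.3%, grade 3: 2/60 = 3.3%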

Conclusion

Our findings demonstrate that ChatGPT answered more than 95% of the questions about urolithiasis accurately and satisfactorily. We conclude that using ChatGPT in urology clinics under the supervision of urologists can help patients and their families better understand the diagnosis and treatment of urolithiasis.


Data availability

Data available on request from the authors.


Author information


Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by HC, AA, and UC. The first draft of the manuscript was written by HC, and all authors commented on previous versions of the manuscript. Conceptualization: AM; methodology: AM; formal analysis and investigation: OY; writing – original draft preparation: UC; writing – review and editing: AA; supervision: FO. The authors did not receive support from any organization for the submitted work.

Corresponding author

Correspondence to Hakan Cakir.

Ethics declarations

Conflict of interest

The authors have no conflict of interest to declare. All co-authors have seen and agree with the contents of the manuscript and there is no financial interest to report. We certify that the submission is original work and is not under review at any other publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (DOCX 32 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Cakir, H., Caglar, U., Yildiz, O. et al. Evaluating the performance of ChatGPT in answering questions related to urolithiasis. Int Urol Nephrol 56, 17–21 (2024). https://doi.org/10.1007/s11255-023-03773-0

