
ChatGPT Performs on the Chinese National Medical Licensing Examination

  • Original Paper
  • Journal of Medical Systems

Abstract

ChatGPT, a language model developed by OpenAI, uses a 175-billion-parameter Transformer architecture for natural language processing tasks. This study compared the knowledge and interpretation ability of ChatGPT with those of medical students in China by administering the Chinese National Medical Licensing Examination (NMLE) to both. We evaluated ChatGPT's performance on three years of the NMLE, each consisting of four units, and compared its results with those of medical students who had completed five years of study at medical colleges. ChatGPT performed below the medical students, and its correct-answer rate was related to the year in which the exam questions were released. ChatGPT's knowledge and interpretation ability on the NMLE are not yet comparable to those of medical students in China, although these abilities will probably improve through further deep learning.
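The comparison described in the abstract reduces to computing a correct-answer rate per exam unit for ChatGPT and for the student cohort, then comparing the two. A minimal sketch of that bookkeeping follows; the unit names and score counts are hypothetical placeholders, not the paper's actual data.

```python
# Hedged sketch of a per-unit correct-answer-rate comparison.
# All unit names and counts below are hypothetical, not the study's data.

def accuracy(correct: int, total: int) -> float:
    """Correct-answer rate as a percentage."""
    return 100.0 * correct / total

# Hypothetical results for one exam year: (correct, total) per test-taker group.
results = {
    "Unit 1": {"chatgpt": (90, 150), "students": (120, 150)},
    "Unit 2": {"chatgpt": (85, 150), "students": (115, 150)},
}

for unit, scores in results.items():
    gpt = accuracy(*scores["chatgpt"])
    stu = accuracy(*scores["students"])
    print(f"{unit}: ChatGPT {gpt:.1f}% vs. students {stu:.1f}%")
```

Repeating this per unit and per exam year yields the grid of rates on which the year-of-release effect reported above can be observed.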


Data availability

Supporting data are available upon request after publication.


Funding

This work received no external funding.

Author information

Authors and Affiliations

Authors

Contributions

WXY and LXY conceived and designed the study, developed the study protocol, performed statistical analyses, and wrote the manuscript. WGX, JJD, XY, ZJL, FQY, and SW encoded and input the data into ChatGPT. GZY and HWG performed quality control and statistical analyses.

Corresponding author

Correspondence to Xiaoyang Li.

Ethics declarations

Ethical approval

This was not a study of human subjects but an analysis of the results of a routinely conducted educational examination. Therefore, neither institutional review board approval nor informed consent was required.

Competing interests

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, X., Gong, Z., Wang, G. et al. ChatGPT Performs on the Chinese National Medical Licensing Examination. J Med Syst 47, 86 (2023). https://doi.org/10.1007/s10916-023-01961-0
