We appreciate the detailed feedback on our study, which addresses the opportunities and risks of the use of LLMs by patients in medicine.

The commentary identifies as a weakness of our study that only one LLM was investigated, without comparison to other language models, and consequently regards the generalizability of the results as limited.

In particular, ethical challenges are seen in the fact that, if no human control or verification takes place, a chatbot could provide false information or references.

We would like to address these comments in the following statement.

We used ChatGPT, which was, if not the only, then the most popular language model available at the time the study was conducted in February 2023, and which was also accessible free of charge. Physicians experienced in spine surgery posed questions about the clinical picture of lumbar disc herniation based on their clinical experience from conversations with patients, i.e., from the patients' point of view.

Thus, the aim of this paper was not to provide a universally valid comparison of LLMs with regard to medical information in general. Rather, it was intended to critically point out to physicians the possibilities, but also the dangers, of an emerging new technology. Since patients may obtain their information from only a single source, they will certainly be informed quickly and quite extensively, but, problematically, not reliably correctly.

Awareness of this, and of the resulting possibilities and dangers, given that patients will certainly use these information channels for self-education in the future, is of great importance for physicians.

Therefore, and here we agree completely, further scientific evaluation of, ideally, all available chatbot systems with regard to medical questions in general is of great importance, as it has immediate ethical implications.