A critical analysis of markers’ feedback on ethics essays and a proposal for change

This article discusses the feedback on students’ ethics essays provided by eight markers in the Faculty of Medical Sciences at Newcastle University. It highlights significant shortcomings, including failures to identify instances where students had failed to select and to conclude on ethical issues (clearly), logical errors, misunderstandings of ethical arguments made in the literature, instances of simple deference, and a lack of critical engagement with relevant literature. Markers also made a large number of linguistic errors and, on many occasions, failed to explain clearly what they meant. Some indication is given of what the cost of this might be to health care students as well as to those who are affected by the quality of their ethical decisions. The article concludes by providing some guidance on how this cost might be reduced at Newcastle University as well as in many other institutions where similar problems are likely to exist.


Introduction
This article analyses the feedback provided by eight markers on ethics essays written by students enrolled in two modules provided by Newcastle University. The rationale for this paper stems from my dissatisfaction with the feedback that has been provided on student essays in ethics and from a desire to drive positive change. This rationale is shared by many scholars who are interested in the teaching of bioethics beyond my own university. In a review of medical ethics teaching in UK medical schools, for example, Brooks and Bell (2017, p. 606) have suggested that current assessments are inadequate, with the result that "tomorrow's patients will be treated by doctors who are inadequately prepared for ethical decision making".
To substantiate these claims and to drive positive change, a careful analysis of markers' feedback is required. By highlighting deficiencies and by making suggestions for improvement, I hope to empower those involved with the marking of ethics, and the provision of written feedback in academic settings more generally, to develop their skills in providing feedback. Whilst students benefit from having what they did well highlighted, as this reinforces their learning, my primary concern lies with the detrimental impact upon student development caused by markers who fail to make suggestions for improvement. If no constructive criticism is provided to students about where they might improve their writing, this might foster the illusion that they did well, thus reinforcing problematic reasoning and the production of shoddy work.
My focus in this article is, therefore, on the identification of situations where markers fail to provide constructive criticism to students. Newcastle University's Research Ethics Committee approved this study. Nine markers who contributed in 2017-2018 to the marking of two postgraduate modules, ONC 8008 Ethical Dimensions of Cancer/Palliative Care and MMB 8100 Research Skills and Principles for the Biosciences, were asked to consent to the use of their feedback in research. All apart from the marker with the smallest marking load consented. Quotations from students' essays were only included where students consented. To avoid selection bias, essays were chosen randomly by using a random number generating programme. Only one essay has been scrutinised for each marker.

A detailed analysis of eight markers' feedback on ethics essays
For ONC 8008 Ethical Dimensions of Cancer/Palliative Care, students were instructed to select a case related to the provision of health care, to identify the ethical issues raised by the case, to outline their reasons for choosing the option that they would adopt and for rejecting the alternative options, and to draw on relevant literature to support their argument. Instructions could have been clearer in that students might have been asked to choose options, rather than an option, as they were asked to identify more than one ethical issue raised by a case.
A particular student received a mark of 74% and was provided with the following feedback: "This is a well-planned and well-presented piece of fluent writing. You analyse the issues in this case very effectively. I would have liked to see more use of explicit ethical language, but you have used many of the appropriate arguments. You could have said more about what would have been involved in a best interests decision. You have drawn from a range of relevant literature and used it to good effect." It is unclear what is meant by the "use of explicit ethical language", but from my extensive experience of reading this marker's comments, he tends to like essays that explicitly mention the four principles of medical ethics, also known as "the Georgetown mantra" (Beauchamp and Childress 2001). If "many of the appropriate arguments" had been used, the question must be asked, however, what might have been gained from the use of particular words. The only other suggestion for improvement was to say more about the making of a "best interests" decision. To phrase this issue more accurately, the marker might have asked the student to say more about how health care professionals ought to make decisions about what is in the best interests of patients. No suggestions for improvement were made about what the student did write.
One issue is with the logic. The student made several logical errors, for example distinguishing two options whereas they were in fact the same option, concluding from the fact that X was a possible outcome that it was a probable outcome, and writing that a general finding about "cancer" could not be "generalised to all cancers". The student could have been made aware of these particular instances of illogicality. The student also invoked two studies that purported to show that older people might feel that they are a burden on health care systems, but she dismissed the studies as "merely social postulations" and "unreliable". This raises two questions: firstly, if they are "unreliable", why did the student mention them; and secondly, why did the student not provide more reliable evidence that some older people feel that they are a burden, given that it is not hard to find such evidence directly from older people themselves (see e.g. McPherson et al. 2007)? The student also reported a study that claimed that "patients fear lack of control over the end of their life" as "advanced decisions … may be ignored", but discredited this view as it was based on the views of physicians. This raises similar questions. In relation to the question whether this perception is also shared by patients, the student could have referred to research on patients' own views that suggests that this fear, and the associated "narrow concern with autonomy", is not prominent (Seymour et al. 2004, p. 65).
The student also included a section on "capacity" in which she questioned whether the patient might have had capacity at the time the "advance care directive" had been written up. This is a legitimate question that poses an important dilemma for health professionals. However, the section failed to discuss this issue in a logical manner as the student also discussed the following issues in the same paragraph: whether the fact that the advance care directive had been written five years before the clinical case should matter; whether health care professionals inhibit autonomy by asking closed questions; whether they are reluctant to start palliative care; and whether discussions about advance care decisions might help patients' relatives.
The student also included a section on "futility", but did not define the term, and included in the discussion some sentences about whether younger people should be given priority over older people, without revealing how this connects either with the case or with the concept of futility. The conclusion also referred to "studies" where it was unclear what these were, and failed to select the option that the student preferred for the issues that they had selected. Minor points are that the student omitted bibliographical data, for example authors' names, did not provide page numbers for quotations, and made several grammatical errors that impaired the clarity of their ideas.
The seven other students whose markers' feedback is analysed in the remainder of this section took the MMB 8100 Research Skills and Principles for the Biosciences module and were asked to write an essay about a fictitious scenario that involved a specialist in reproductive medicine (Rachel) and a researcher working in the field of mitochondrial disease (David). In a nutshell, the scenario involves David offering "to pay" £ 500 per treatment cycle to any woman aged between 21 and 35 who is willing to provide eggs as he needs more eggs for his research, which aims to help women with defective mitochondria to have a child who would not be affected by mitochondrial disease. Women are recruited through Rachel. Karen, a local woman, responds to Rachel's advert. Rachel explains a number of arguments that have been used in support of the legal position on embryo research in England when Karen queries this. Karen states that she does not really understand the procedure, but that she would like to participate. Rachel offers her a consent form, which is signed by Karen. Students were asked to identify three issues, to provide arguments for and against the different options, to select a course of action to resolve each of the dilemmas, and to discuss any English legislation that might be relevant in relation to their decisions. Students were also instructed to use any sources that contained the arguments that they considered the strongest.
One student received a mark of 65% and was provided with the following feedback: "Perhaps could have considered more explicitly the issue of potentiality in relation to embryos along with a utilitarian viewpoint on their use. MMSE probably isnt a useful assesment of capacity and there are more formal measures used. One issue you might have considered would have been downstream psychological effects on the child in relation to being labelled a three parent baby but this is a minor point." The problem with this feedback is that it is not clear what "the issue of potentiality" refers to. Presumably, it refers to the argument from potentiality that has been debated in the context of the ethics of embryo research (Deckers 2005). This argument, however, is only one argument that has been used in the context of discussing the ethics of embryo research. There are many other arguments that could be, and in fact were, discussed by the student in relation to this topic. The examiner should therefore explain why this particular argument is crucial (which it is not), and how it hangs together with "a utilitarian viewpoint". There is no reason why this argument would have a greater connection with the use of embryos compared to the other arguments that the student wrote about, notably the arguments from sentience and the argument from twinning (see e.g. Deckers 2007).
The second issue that the student identified is that "of whether Karen's autonomy has been disregarded", which is an empirical, rather than a moral issue. In relation to the moral issue of whether it is appropriate to rely on the "consent" from someone who lacks capacity, the student made a vague claim (without providing a reference) that there "are now a variety of psychiatric tests which can give an idea of an individual's mental capacity, such as the mini mental state examinations" (MMSE) and that Rachel should "use one of these tests to see if Karen has the competence to make such a decision". The marker takes issue with this, but should mention why "MMSE" might not be appropriate, what "more formal measures" he had in mind, and why these would be better to assess the capacity of a research participant. Importantly, the marker did not mention that the student did not engage with the Mental Capacity Act 2005 in relation to how to assess capacity.
If the marker's final point is minor, which it may well be, the question must be raised why the student only scored 65%. Essentially, the student is left with the suggestion that there are three things that they could have considered writing about. Nothing is said about what the student actually did write. In relation to the third issue, the student actually did not identify the issue clearly, at least not initially, and spent some time discussing whether mitochondrial replacement therapy (a technique used in IVF where some or all of the egg cell's or the embryo's mitochondrial DNA is replaced by the mitochondrial DNA from another egg cell or embryo) might be classed as germ-line therapy, where the scientific issues obscured the moral issues. The student also used an inaccurate and inconsistent referencing system, and failed to provide a conclusion for two of the three issues.
Another student scored 51% and received the following feedback: "The essay does not contain many references, just three, one act, one website and one research article. There is no balance between arguments for and against. There is an absence of arguments supporting Rachel in case of the consent or supporting the use in embryos in research. Essay is quite superficial. The student's opinion is stated before any arguments." The marker did well to point out that there were few references and that the argument was weak. However, specific examples could have been given of which arguments should have been developed and how this might have been done. Much more feedback could have been provided. The student appealed to the "Medicines for Human Use (Clinical Trials) Regulations 2004", for example, which is not relevant here as this is not a clinical trial. The student deferred to the authority of others in particular instances, for example in relation to the test of capacity in the Mental Capacity Act 2005. The student also quoted the Belmont Report without providing a reference (National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, Department of Health, Education and Welfare 1978), and failed to distinguish clearly between payment for egg donation, the illegality of which was ignored (Human Fertilisation and Embryology Act 1990, section 12, para. 1 e), and compensation for expenses incurred, which can be legal (Human Fertilisation and Embryology Authority 2015).
The student also did not argue for the distinction they made between embryos and "humans". In relation to the issue about embryo research, the student merely mentioned several arguments without evaluating any, and misunderstood the argument from potentiality. The student could have been reminded here that they had been instructed to develop those arguments that they considered the strongest, rather than to list a number of them. The student could also have been informed that the meaning of two sentences in particular was unclear as they made several grammatical errors, and that their referencing system was inaccurate and inconsistent.
A third student scored 70% and was informed that the essay was "without any major flaws". The marker pointed out one inconsistency ("paying the women for any reasonable out-of-pocket expensive harbours no ethical problems, however paying participants is more controversial"), and referred the student to legal guidance on one issue, notably the rule that compensation up to £750, but no payment, is allowed for human egg donation. The problem with this is that pointing out these shortcomings fails to justify why the student lost 30%.
Many other deficiencies could have been pointed out. These include: the repetition of a few points about how to assess capacity; a particular misunderstanding of a particular source; a lack of clarity in relation to a point made by another author; a lack of logic in some sections (e.g. a failure to distinguish payment from compensation); the use of an inappropriate structure (the conclusion preceding the introduction of new material for one issue); a lack of engagement with the extensive literature on one particular argument (around the choice between saving a child and saving a tray of embryos) (Annas 1989); and confounding two arguments, namely the arguments from probability and potentiality in relation to the status of the human embryo (Deckers 2007). Particular examples could also have been provided of where the student deferred to others and the marker could have pointed out that the student did not refer to a primary source for an argument they mentioned. In addition, specific examples could have been provided of where the wording/phrasing undermined the clarity.
Another student scored 48% and was provided with the following feedback: "You do identify three ares of ethical concern from the case study. Some attention to structure would greatly assist the reading and it would also help you to organise your thoughts.
You haven't really done what was asked, to weigh up the strengths and weaknesses of different ethical strategies for each of the points you identify. You also need to bring out the specific ethical issues with reference to the relevant literature and also make reference to the legal and regulatory frameworks; you do this partially. Your analysis of the issues concerning the research methods (moral status of embryo?) is quite weak as is your analysis of the incentive/ inducement issue. I am not sure where the figure of £15 comes from but the HFEA suggest up to £750." The basic problem with this feedback is its vagueness. The student wrote that their second issue was not really an issue, which raises the question why the student decided to write about it. They also returned to the first issue after introducing the second issue. The student could also have been informed that the essay might be improved by: explaining which law they had in mind where they referred to "the law"; correcting a flaw in legal understanding (i.e.: the appropriate body in the Mental Capacity Act 2005 that is responsible for evaluating research is not a family member, but an ethics committee); and avoiding irrelevance (by omitting the provision of legal guidance related to what should be done with minors as it does not apply to this case). Specific examples could also have been provided of deference, lack of clarity and argumentation, and illogicality. Finally, the suggestion could have been made that they should use primary and accurate sources for particular legal texts; and that they should use an appropriate referencing system.
A fourth student received 70% and the following feedback: "This is a good essay and all the arguments are appropirately considered." This raises the question why the student lost 30%. Another student received 54% and was given this feedback: "The text has included some good points in relation to embryo research and obtaining consent appropriately. The provision of false information may be attributed to an absence of an adequate level of information and impact on the issue of consent." In relation to this second sentence, I am unsure what the marker meant. Perhaps the marker meant that the student wrongly concluded that false information had been provided by Rachel, given that this was not clear from the case. On this reading, the student's third dilemma was spurious as it centred around the ethics of providing potentially incorrect information. Whereas the marker did well to question the relevance of the student's third dilemma, much more could have been said about the essay. The student in question actually approached me for further feedback, which I provided.
It included pointing out that they could have: identified the issues more clearly, for example by using a "whether or not" structure; avoided a particular section that did not address the question; been more precise in relation to a particular claim; improved clarity in particular phrases, for example in relation to a discussion about why one's probability of surviving beyond a certain stage should (not) matter morally; explained what they think about "the argument from ensoulment/twinning" and why they deferred to the law on how to assess a person's capacity; discussed what should be done if Karen were to lack capacity, for example by engaging with the legal requirement that (unpaid) carers or nominated third parties should be involved (Mental Capacity Act 2005, section 32); resolved inaccuracies in the referencing section; and avoided vagueness in the final paragraph by providing clear conclusions on all issues (Deckers 2007). I also argued that they had misunderstood a particular argument that had been made in the literature.
The final student received 47% and the following feedback: "Evidence of supplementary reading and serious effort to come to grips with some key issues, but the writing is too confused and unclear to merit a passing grade. As just one example, what does 'whether the risks should be involved' mean? Is there any realistic basis for the worry that 'the extra eggs may be used as commercial material by some evil people in order to gain more interests' under the applicable regulatory regime? I think I see what you are trying to get at with the discussion of non-maleficence on p. 2, in that a 'zero risk' criterion would mean many kinds of research would never happen, but the articulation of this point is unclear. etc. etc." The wording of this essay was particularly poor, which is why many ideas were unclear. The student could have been provided with additional examples of where the meaning was unclear. They might also be prompted to check their writing carefully for linguistic errors and to ask someone to proofread their future essay. The second issue, for example, was identified as "whether the risks should be involved", where the student might have meant to say "whether the risks were acceptable". In relation to this issue, a significant shortcoming is that the student did not explain their stance.
The student's third issue was "whether the donors should be informed in detail". Here, the student resolved the issue by stating that "the researcher could explain the risks in the euphemistical way", where the student could have been informed that the meaning was unclear. The student could also have been told that it was unclear which risks potential donors should be informed about, with reference to the literature. For example, they might have mentioned that the drugs that are taken to stimulate the ovaries might cause ovarian hyperstimulation syndrome, which they had mentioned in the first section of the essay. As the length of the essay was 225 words under the maximum word limit, they might also have been informed that they could have lengthened the essay, and engaged with a wider body of literature. Finally, the student could have provided information about page/paragraph numbers for their quotations.

Discussion
All eight markers failed to provide appropriate feedback to students. Failures included shortcomings in identifying: instances where students had failed to identify ethical issues (clearly); students presenting moral dilemmas as amoral choices; logical errors; which parts of the essay were developed insufficiently; which arguments made by others were misunderstood; particular instances where the student deferred uncritically to the opinions of others; relevant legal documents that students had not engaged with, which was particularly problematic where they adopted a stance that was contrary to the law; failures to decide on all issues; and instances where they had not complied with guidance on citing and referencing. Markers also made a large number of spelling and grammatical errors, shown also by the verbatim comments included above.
These failures were not new. Whilst some markers contributed to the marking for the first time, others had been doing so for a long time. Several attempts to improve the situation had been made, partly through the provision of more and more elaborate model answers as well as through discussion of particular essays in moderation meetings. In spite of these, insufficient progress had been made. A substantial part of the problem is that the marking of essays is carried out by a range of staff who have received no or very little background training in ethics. This problem is not unique to Newcastle University. Ten Have, for example, writes as follows about the situation at Duquesne University: "Most scholars who are teaching ethics do not have any degree in ethics, do not publish in ethics journals, and ethics is not their daily business. (…) Such a situation of educational anarchy would not be tolerated in many other academic disciplines" (ten Have 2018, p. 1). This type of situation, however, appears to be widespread. Commenting on the situation in Turkey, for example, Arda (2019, p. 86), a teacher in the Department of Medical Ethics and History of Medicine at Ankara University, writes: "Since the number of competent academicians in the field is not enough, lecturers from other fields or specialties who have interest in medical ethics or history give these courses. This embraces the problem of degrading these courses to a hobby rather than an academic specialty." It is unclear to what extent this is a problem in the UK. However, in their review of ethics teaching in UK medical schools, Brooks and Bell (2017, p. 612) reported "a reduction in the number of schools with full-time academics taking responsibility for ethics education since 2006".
At Newcastle University, the problem that I highlighted in this paper was addressed as follows: as I recently became the module leader of ONC 8008, I decided not to engage the marker whose feedback is analysed above any longer in the future. In light of the fact that the number of students who take this module is likely to continue to be small, I will be able to handle the marking load myself. For MMB 8100, however, the situation is different. More than 200 students partake in this module every year. The University has a rule that marks and feedback must be returned within 20 working days. Whilst this can be extended in some situations, the timing of the module is such that students must receive their marks and feedback fairly quickly to allow sufficient time for the writing of new essays in resits. In light of this situation, I decided to alter the assessment. In 2019-2020, students will no longer be asked to write an essay, but to complete an examination. The examination will use a combination of 'extended matching item' ('EMI') questions and short answer questions. The former are easy to mark as students merely have to pick the correct answer from a list of options. The latter are more difficult to mark, and will have to be double-marked, as stipulated by examination rules. Attempts will be made to select questions where answers are relatively unambiguous, and the marking scheme will be relatively easy in that students are rewarded for the production of factual information. Questions will involve asking the students to identify correct legislation and moral arguments that have been used in the literature.
This change comes at a significant cost to the students: they will no longer be provided with any feedback on how they analyse and evaluate ethical arguments that they identified from the literature on specific topics that they were prompted to think about by means of a case. Instead, they will be rewarded for their capacity to memorise and understand relevant legislation and arguments developed in the lectures. As students are no longer stimulated to resolve ethical issues that they may not have thought about before, their capacities for independent thinking are no longer rewarded. They will no longer receive any feedback on their writing skills. Particularly those students who provide simplistic solutions to complex moral problems will miss out on opportunities to be provided with feedback that questions their assumptions and points them to relevant literature that might help them to reject the counterarguments that could be raised against their positions. Students will no longer be stimulated to synthesise knowledge from multiple sources and to evaluate divergent opinions in order to take a stand on particular issues, which Bloom et al. (1956) considered to be the highest levels of learning in their taxonomy (see also e.g. Rentmeester 2018). Students are therefore also likely to alter their learning strategies. As Myser et al. (1995) have argued that "when students know that their assessment will involve the application of clinical ethical reasoning to the management of actual cases they are more likely to structure their learning to acquire those skills", the reverse might be expected where they are no longer stimulated to apply their knowledge to new scenarios. The upshot is what Martin (2016, p. 93) has called "normative apathy about the moral scope and limits of education" and a concomitant devaluation of the student "as a person worthy of moral respect".
In order to remove this cost and restore the original type of assessment, the following actions must be taken. Firstly, some staff who have been tasked specifically with the teaching of ethics, but who have consistently failed to provide appropriate feedback, should be fired where their feedback deviates too significantly from what is acceptable. To facilitate their dismissal, extensive advice should be sought from moral philosophers with expertise in bioethics who are external to the University, to avoid potential conflicts of interest. Secondly, other staff whose marking shows significant deficiencies should be provided with greater training opportunities to develop their skills. At the moment, such opportunities are limited, particularly since many staff who contribute to the marking are not specialised in ethics and are either unwilling or unable to commit fully because of their significant commitments elsewhere. Whilst some of this training could be provided internally, most of it should probably be provided by someone who is external to the Faculty or to the University to reduce the possibility that those with idiosyncratic views on what good marking consists of might exert coercive influence over others with different views. Thirdly, new staff should be hired. In light of the profound deficiencies that have been pointed out above, the selection of these staff should be carried out very carefully. Significant attention must be paid to staff's abilities to write essays themselves, which should be assessed primarily by evaluating their publication records.

Conclusion
This article has presented detailed evidence that the feedback that has been provided by staff at Newcastle University on ethics essays for two modules shows significant deficiencies. This has been a source of frustration for some students, as well as for some staff members. The existence and persistence of this situation has been facilitated by the fact that there is little expertise in ethics amongst staff. Some evidence has been provided that this is a widespread phenomenon, which has also been widely criticised, for example by ten Have who has argued rightly that the teaching of ethics ought not to be "an amateur hobby" (ten Have 2018, p. 2). In order to address the problem caused by amateurism, Newcastle University must fire staff who are incapable, train staff with limited capabilities, and hire new staff. In the short term, complex assessment methods must be replaced with simpler methods. Whilst this simplicity comes at a significant cost to students, who are trained insufficiently to become independent deliberators, and whose autonomy is therefore jeopardised (Gracia 2016), the provision of adequate feedback for simple assessment tasks may be preferable to the provision of inadequate feedback for more complex assessment methods.