Background

Biomedical animal research (AR) involves some harm to sentient animals, including distress (due to confinement, boredom, isolation, and fear), pain, and early death [1-3]. AR is said to be morally permissible because the balance of these costs (harms to the animals) and benefits (to human medical care, quality of life, and survival) is favorable [4]. It is generally assumed that the benefits to human medicine are great [5]. Awareness of the empirical costs and benefits of AR is important in medicine for several reasons. Health care workers (HCW) often perform (and are expected to perform) AR, promote AR directly with trainees and indirectly as role models, and advocate for the use of public funds (from granting agencies and charitable foundations) for medically related AR.

There is a growing literature that raises concerns about the empirical practice of AR in at least two domains. First, the methodological quality of AR is often poor in both experimental design and animal welfare aspects [6-12]. AR publications rarely report the use of eligibility criteria, randomization, allocation concealment, blinding, sample size calculation, primary outcome specification, and study replication [6-10,13,14]. AR publications rarely report performing a systematic review to determine the necessity of the research project, rarely report the use of continuous monitoring of the level of anesthesia or pain control, and often do not report the use of acceptable methods of euthanasia [7,11,12]. Second, the translation rate from AR to humans has been disappointing [15-18]. Extensive AR in the fields of sepsis [19-21], stroke [22,23], spinal cord injury [24,25], traumatic brain injury [26], cancer [27,28], degenerative neurological diseases [29,30], acquired immunodeficiency syndrome [31], asthma [32], and other fields [15-18] has translated to humans in 0-5% of cases. Pharmaceutical companies have found that, of drugs that work well in AR and progress to human clinical trials, only ≤8% are found safe and effective enough for market approval [33,34]. AR to determine toxicology [17,35-37], carcinogenicity [17,37-39], and teratogenicity [17,40] of drugs or compounds is no more accurate than chance, with concordance rates between species generally <40%.

Since most AR is funded by public money through government and charitable granting agencies, it is important to know the public perception of, and the level of public support for, AR. Surveys of the public find that the majority are ‘conditional acceptors’ of AR; they accept the practice because of the promise of cures and treatments for life-threatening and debilitating human diseases, so long as animal welfare is at least minimally considered and protected [41]. To our knowledge, no survey has asked for the details of this conditional acceptance of AR. In this survey we ask HCW directly what the minimal acceptable standards in AR methodology might be, and what the minimal acceptable translation rate of AR to human treatments might be. This is important in order to determine how strong the support is for the empirical practice of AR, and how AR could be improved to increase the level of support. We found that HCW have high expectations for both the methodological quality of AR and the rate at which its findings translate to humans.

Methods

Questionnaire administration

All pediatricians and pediatric intensive care unit nurses and respiratory therapists (RTs) affiliated with one Canadian university were e-mailed the survey using a secure electronic survey distribution and collection system (REDCap, Research Electronic Data Capture) [42]. A cover letter stated that “we very much value your opinion on this important issue” and that the survey was anonymous and voluntary. We offered the incentive that if the response rate was at least 70% we would donate $1000 to the Against Malaria Foundation or the PICU Social Committee. Non-responders were sent the survey by e-mail at 3-week intervals for 3 additional mailings.

Questionnaire development

We followed published recommendations [43]. To generate the items for the questionnaire, we searched Medline from 1980 to 2012 for articles about the methodology and translation of AR. The authors then collaboratively created the background section and questions for the survey. Content and construct validation were done using a table of specifications filled out by experts, including two ethics philosophy professors and two pediatricians. Face and content validation were done by pilot testing of the survey by non-medical, university-educated lay people (n = 9), pediatricians (n = 2), pediatric intensive care nurses (n = 2), and an ethics professor (n = 1). Each pilot test was followed by a semi-structured interview by one of the authors to ensure clarity, realism, validity, and ease of completion. A published clinical sensibility tool was used for the expert and pilot testing [43]. After minor modifications, the survey was approved by all the authors.

Questionnaire content

The background section stated: “In this survey, ‘animals’ means: mammals, such as mice, rats, dogs, and cats. It has been estimated that over 100 million animals are used in the world for research each year. There are many good reasons to justify AR, which is the topic of this survey. Nevertheless, some people argue that these animals are harmed in experimentation, because their welfare is worsened. In this survey, ‘harmful’ means such things as: pain, suffering (disease/injury, boredom, fear, confinement), and early death. This survey is about how AR should be performed. We value your opinion on the very important issue of the methodology of AR.”

We presented demographic questions, 15 questions that asked respondents “about the methods of AR that are commonly discussed by animal researchers”, 4 questions that asked the respondent “to consider what you think the benefits to humans are as a result of AR”, and 8 questions that asked the respondent “for your opinions about what you expect from AR paid for with public funds (for example, funding by government using tax dollars, or charitable foundations using donations).” Response choices included scales of “strongly agree, agree, undecided, disagree, strongly disagree”, “nearly always, often, sometimes, not often, almost never”, and “5-20%, 21-40%, 41-60%, 61-80%, over 80%” depending on the type of question. All the questions are shown in Tables 1, 2, 3, and 4.

Table 1 Demographics of respondents
Table 2 Healthcare worker expectations regarding the methodology used in animal research
Table 3 Healthcare Worker perception of the benefits to humans from animal research
Table 4 Healthcare worker expectations for translation to humans from animal research paid for with public funding

Ethics approval

The study was approved by the health research ethics board 2 of our university (study ID Pro00039590) and return of a survey was considered consent to participate.

Statistics

The web-based tool (REDCap) allows anonymous survey responses to be collected, and later downloaded into an SPSS database for analysis. The proportions of respondents with different answers were expressed as percentages. The responses of the two predefined groups, pediatricians and pediatric intensive care unit nurses/RTs, were compared using the Chi-square statistic, with P ≤ .05 after Bonferroni correction for multiple comparisons considered significant.
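
The group comparison described above can be illustrated with a minimal sketch. It is not the authors' SPSS analysis: it assumes each question is collapsed to a 2 × 2 table (for example, disagree versus all other responses), uses the Pearson chi-square statistic without continuity correction, and assumes one comparison for each of the 27 non-demographic questions (15 + 4 + 8). The function names and the counts are hypothetical and are not study data.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are the two respondent groups; columns are the two collapsed
    response categories. Returns (chi-square statistic, two-sided P value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one response")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X >= chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni_significant(p_value, n_comparisons, alpha=0.05):
    """Compare an unadjusted P value with the Bonferroni threshold alpha / m."""
    return p_value <= alpha / n_comparisons


# Assumption: one test per non-demographic question (15 + 4 + 8).
N_COMPARISONS = 27

# Hypothetical counts, for illustration only (not taken from the tables):
# group 1: 18 disagree, 26 other; group 2: 4 disagree, 65 other.
chi2, p = chi_square_2x2(18, 26, 4, 65)
print(f"chi-square = {chi2:.2f}, P = {p:.2g}, "
      f"significant after Bonferroni: {bonferroni_significant(p, N_COMPARISONS)}")
```

Dividing the significance threshold by the number of comparisons is equivalent to multiplying each P value by that number and comparing it with .05. A comparison across all five response categories would instead use a 2 × 5 table with 4 degrees of freedom, for which the closed-form P value above does not apply.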

Results

Pediatricians

Demographics

Forty-eight responded, but only 44/114 (39%) gave responses to more than the demographic questions. Demographics are given in Table 1.

Expectations regarding methodology of AR

The majority of respondents agreed that anesthetic use should be monitored during surgery (100%), that pain should be monitored after this surgery, even overnight (91%), and that experimenters in a research study should have similar training on the procedures involved (97%) (Table 2). The majority disagreed that it is acceptable to use less humane methods of euthanasia to reduce costs or to improve results (82% and 52%, respectively), to use animals when alternatives are available (73%), to do an animal experiment without a systematic literature review (100%), and to do an animal experiment using suboptimal methods (including randomization, blinding, and primary outcome specification) in order to save costs (82-93%). Only a minority of respondents agreed that failed animal models of a disease should continue to be used (30%), or that stressed animals should be used (37%). Finally, the majority agreed that guidelines consistent with these responses should be required for publicly funded AR (95%).

Perceptions of human benefits from AR

Most respondents believe that discoveries from AR sometimes or often lead to a treatment for human disease directly (77%) or indirectly (84%), and that researchers sometimes or often claim large benefits from AR (91%) (Table 3). The majority did not agree (84%) with the statement that “AR rarely produces benefits to humans.”

Expectations for translation to humans from AR paid for with public funding

The majority of respondents think that drugs tested on animals should correctly predict the following for humans at least 41% of the time: adverse reactions (69% of respondents), disease treatment (62% of respondents), carcinogenicity or teratogenicity (74% of respondents), and treatment of stroke, severe infection, cancer, brain or spinal cord injury (59% of respondents). The majority also expected that replication of AR findings in second laboratories or other strains of the animal should occur at least 61% of the time (95% and 68% of respondents respectively). The majority agreed that misleading (in terms of human benefit and/or harm) animal experiments should occur at most 40% of the time (86% of respondents). Finally, when asked to “assume drugs studied in animals accurately predict effects in humans less than 20% of the time. If this were true, it would significantly reduce your support for animal research”, 40% disagreed (Table 4).

Pediatric Intensive Care nurses and RTs

Demographics

Sixty-nine of 120 (58%) responded: 52 (75%) nurses and 17 (25%) RTs. Demographics are given in Table 1.

Expectations regarding methodology of AR

The majority of respondents agreed that anesthetic use should be monitored during surgery (98%), that pain should be monitored after this surgery, even overnight (96%), and that experimenters in a research study should have similar training on the procedures involved (96%) (Table 2). The majority disagreed that it is acceptable to use less humane methods of euthanasia to reduce costs or to improve results (87% and 81%, respectively), to use animals when alternatives are available (88%), to do an animal experiment without a systematic literature review (96%), and to do an animal experiment using suboptimal methods (including randomization, blinding, and primary outcome specification) in order to save costs (87-95%). Only a minority of respondents agreed that failed animal models of a disease should continue to be used (27%), or that stressed animals should be used (19%). Finally, the majority agreed that guidelines consistent with these responses should be required for publicly funded AR (91%).

Perceptions of the benefits to humans from AR

Most respondents believe that discoveries from AR sometimes or often lead to a treatment for human disease directly (84%) or indirectly (88%), and that researchers sometimes or often claim large benefits from AR (97%) (Table 3). The majority did not agree (87%) with the statement that “AR rarely produces benefits to humans.”

Expectations for translation to humans from AR paid for with public funding

The majority of respondents think that drugs tested on animals should correctly predict the following for humans at least 41% of the time: adverse reactions (85% of respondents), disease treatment (82% of respondents), carcinogenicity or teratogenicity (89% of respondents), and treatment of stroke, severe infection, cancer, brain or spinal cord injury (88% of respondents). The majority also expected that replication of AR findings in second laboratories or other strains of the animal should occur at least 61% of the time (92% and 83% of respondents respectively). The majority agreed that misleading (in terms of human benefit and/or harm) animal experiments should occur at most 40% of the time (84% of respondents). Finally, when asked to “assume drugs studied in animals accurately predict effects in humans less than 20% of the time. If this were true, it would significantly reduce your support for animal research”, only 6% disagreed (Table 4).

Differences between pediatricians versus nurses/RTs

There were few statistically significant differences. Nurses/RTs more often expected that drugs tested in animals for stroke, severe infection, cancer, and brain or spinal cord injury should work in humans. Nurses/RTs were also more uncertain whether AR “rarely produces benefits to humans”, and would be less supportive of AR if it accurately predicted effects in humans <20% of the time.

Discussion

There are several important findings from this survey. First, most HCW respondents expect that AR is done with high methodological quality, and that costs and difficulty are not acceptable justifications for lower quality. Most expect that guidelines for AR funded with public money should be consistent with these expectations. Second, most respondents thought that there are either sometimes or often large benefits to humans from AR. Most disagreed that “AR rarely produces benefit to humans.” Third, most respondents expect that AR findings should translate to humans at least 41% of the time, with many expecting this at least 61% of the time. This includes AR findings of adverse events (toxicity), carcinogenicity and teratogenicity, and disease treatments. The majority thought misleading AR results should occur no more often than 20% of the time. If translation from AR to humans were to occur <20% of the time, most would be less supportive of AR. Finally, most respondents expect that AR findings should be reproducible between laboratories and between strains of the same species. These findings have important implications for public and HCW acceptance of AR (Table 5).

Table 5 Possible important implications of the findings from this survey

Previous public surveys have generally asked only whether people support AR for human benefit, and have not asked people to spell out the details of their expectations of AR. For example, the Eurobarometer asks whether “scientists should be allowed to experiment on animals like dogs and monkeys if this can help sort out human health problems”; in 2010, 44% of Europeans responded ‘agree’ and 37% ‘disagree’ [44]. This support for AR was linked with “greater appreciation of the contributions of science to the quality of life” and “an omnipotent vision of science” [45]. In the UK, the 2012 Ipsos MORI survey determined that most (85%) are ‘conditional acceptors’ of AR; people accept AR “so long as it is for medical research purposes”, “for life-threatening diseases”, “so long as there is no unnecessary suffering”, or “where there is no alternative”, considering AR as a “necessary evil” for human benefit [41]. In the United States, the 2011 Gallup Values and Beliefs survey found that when asked whether medical testing on animals is morally acceptable or morally wrong, 43% (and 54% of young adults 18-29 yr old) responded ‘morally wrong’ [46]. In a survey in Sweden including patients with rheumatoid arthritis and scientific expert members of research ethics boards, most respondents agreed to AR for at least some type of biomedical research. Support was highest for AR into “fatal diseases” (83.1%), and diseases with “insufficient treatment options” (82.1%) [47]. In a UK survey of scientists promoting AR, lay public, and animal welfarists, the support for AR (on a 7-point Likert scale) was 5.33 (1.46), 3.57 (1.70), and 1.48 (0.87) respectively. Scientists and lay public supported animal use only for “medical research”, and not for dissection, personal decoration, or entertainment [48]. These surveys suggest people support AR on the understanding that it is necessary to provide significant benefit for humans with severe diseases, and is done to high ethical standards. However, none asked for the level of detail sought in our survey.

Some qualitative research also suggests there is conditional public acceptance of AR based on a utilitarian analysis of costs (to animals) and benefits (to humans) [49,50]. This conditional acceptance is usually based on the assumption that regulation has ensured that AR is conducted to high animal welfare standards and is of high scientific validity and merit (i.e., high quality research, leading to human benefit and cures), and that there are no alternative research methods [49-51]. Scientists understand this role of regulation as leading to societal acceptance of AR, and see regulation as legitimating AR practice [51-53]. However, our survey suggests that this trust in regulation may be misplaced, because regulation does not result in AR that meets HCW expectations for animal welfare, methodological quality, human benefit, or rates of translation to human medicine and cures (Table 5). Moreover, these studies showed that the public is far less accepting of the use of genetically modified animals in research, based on a deontological approach where this AR is seen as ‘wrong’ [49,50]. We did not ask about the common use of genetically modified animals in AR, and therefore may have underestimated HCW expectations of AR.

There are two main explanations for the poor predictive accuracy of AR for humans. First, it is possible that the poor methodological quality of AR has resulted in a biased literature that has led to many human trials based on inappropriate data. Second, it is possible that animal models are not good ‘causal analogical models’: they are not useful for extrapolating findings to humans because there are major causal disanalogies between species [54,55]. Animal models are based on this reasoning: when an animal model is similar to the human with respect to traits/properties a,b,c [e.g. fever, hypotension, and kidney injury in sepsis], and when the animal model is found to have property d [e.g. response to protein-C treatment], then it is inferred that the human also likely has property d. This ‘causal analogy’ assumes that there are few causal disanalogies: few properties e,f,g that are unique to either the animal or human and that interact causally with the common properties a,b,c. However, animals are evolved complex systems; they have a myriad of interacting modules at hierarchical levels of organization [56]. As a result of this complexity, animals have emergent properties [e.g. animal traits/functions, like property d] that are dependent on initial conditions [e.g. gene expression profiles, the context of the organism, like properties a,b,c, and e,f,g]. In complex systems [e.g. animals], very small differences in initial conditions [e.g. properties e,f,g specific to a species/strain] can result in dramatic differences in response to the same perturbation [e.g. drug, treatment, or disease leading to property d] [54-58]. There is much empirical data finding major causal disanalogies between animal species: differences in gene expression at baseline and in response to perturbations, and in disease susceptibilities [59-62]. Thus, complexity science suggests there may be an in-principle limitation on the ability of AR to predict human responses. Our survey suggests that these competing explanations must be sorted out to determine whether translation can meet public expectations in weighing the costs and benefits of AR.

This study has several limitations. Response rates for pediatricians and nurses/RTs were 39% and 58% respectively; thus we cannot rule out biased participation in the survey. The statements presented needed to be short and concise, and this may have left out important details that would have influenced the understanding of and response to the text. The moderate sample size from one university limits the generalizability of our results. Nevertheless, this is the first survey we are aware of that asks any group to consider not just whether they support AR but, in detail, what they expect of the methodology and translation of AR. Strengths of this study include the rigorous survey development process, and the inclusion of the most common critiques of the empirical practice of AR. Future studies should determine the generalizability of our results.

Conclusion

We found that HCW respondents had high expectations for the methodological quality of AR, and for the translation of findings from AR to human responses to drugs and disease. These expectations are far higher than what the empirical data show has been achieved. This disconnect between HCW expectations of AR and the empirical reality of AR suggests that if HCW were better informed they would likely withdraw their conditional support of AR. Improved methodological quality is an achievable goal if this is prioritized by researchers, reviewers, editors, and funders. Whether methodologically optimal AR can achieve better human translation to meet HCW expectations is an open question.