Background

For a long time, bioethical research was primarily concerned with theoretical reflection on normative questions. This paradigm rests on a sharp separation of normative from descriptive ethics, which differ fundamentally in their epistemological interest and in the methods they employ. In the past two decades, however, this separation has increasingly been overcome, and a growing branch of empirical research in bioethics has emerged. This change has been described as the empirical turn in bioethics [1]. Multiple authors have presented evidence for the increase in empirical studies investigating ethical topics [2,3,4]. The most recent of these showed that, in a sample of nine bioethics journals, 18% (n = 1007) of the original papers collected and analyzed empirical data [3].

Empirical research can contribute to bioethics in several ways, and several categorizations of these contributions have been suggested, including studies on beliefs, perspectives, (new) issues, facts relevant to normative arguments, likely consequences or effectiveness [5,6,7].

These contributions can become relevant for bioethics in at least two ways (see Fig. 1). First, empirical research can inform the development or refinement of ethical recommendations. It does so by providing information about beliefs, perspectives, facts relevant to normative arguments, likely consequences or new ethical concerns [5,6,7]. For example, attitudes research, such as surveys of different stakeholder groups, might reveal important viewpoints on biobanking issues that should be acknowledged when developing practice-oriented guidance [8]. Another example is the assessment of clinicians’ experiences and views about ethical issues encountered in clinical practice in order to develop a clinical ethics support system [9].

Fig. 1 Two types of empirical research aiming to inform ethical practice

A second, complementary set of empirical studies aims to evaluate how effectively, efficiently, or validly ethical recommendations are translated into practice [5,6,7]. For example, studies assessing whether and how clinical trials are prospectively registered and how they report their results clarify the implementation of two ethical recommendations (principles 35 and 36) included in the Declaration of Helsinki [10,11,12]. Likewise, an empirical study assessing the understanding of informed consent materials provides information that helps to determine whether the ethical recommendations on informed consent are implemented in a valid way [13,14,15]. More examples of this set of empirical studies are presented in Table 1 of the protocol.

Table 1 Definitions

In many areas that “scientifically” develop recommendations on how to act in certain situations (e.g. medical treatment recommendations), the recommendations are subsequently evaluated for how effectively and efficiently they can be and are implemented [16]. In the field of bioethics, however, multiple authors have criticized a neglect of empirical research that evaluates the consequences of normative recommendations [6, 17, 18]. A recent review of practice evaluation of biobank ethics and governance conducted by our working group showed the need for more practice evaluation in the area of normative biobank governance [19].

The field of empirical studies evaluating how ethical recommendations are translated into practice has not yet been systematically investigated. To inform the methodological discourse of this field, the aim of this cross-sectional study is a threefold mapping. First, to understand the current scope of this field, we aim to map the quantitative proportions of evaluative empirical research published in leading bioethics journals (Journal of Medical Ethics, Nursing Ethics, AJOB Empirical Bioethics and BMC Medical Ethics). Second, to understand the scope of overarching research objectives, we want to map how often the evaluation object of empirical evaluative studies reflects broad (aspirational) norms, specific norms, or best practices, a typology suggested by Sisk and colleagues [24]. Third, to understand the qualitative spectrum of the more specific research objectives of this field, we aim to inductively map the specific objects of evaluation (the types of ethical recommendations).

Methods

The methodology used was described in a study protocol that was registered publicly at the Open Science Framework (https://osf.io/r6h4y/).

Sample size

The sample size was determined with the goal of reaching thematic saturation for the assessment of the more specific study objectives. Former reviews on ethical issues and policies showed that the analysis of approximately one hundred papers provides a qualitatively rich account of information that allows thematic saturation to be reached for major categories [20]. However, this can only be estimated a priori, as it is contingent on the categorization process.

Search

To identify the quantitative proportion and methodological characteristics of evaluative empirical bioethical research in contrast to non-evaluative empirical research (see Fig. 1), a set of peer-reviewed bioethics journals was used as a data source. We included the Journal of Medical Ethics and Nursing Ethics because a recent review by Wangmo et al. had shown that they publish the highest proportion of empirical research of all included journals [3]. Additionally, we included AJOB Empirical Bioethics and BMC Medical Ethics, which were not part of the review by Wangmo et al. but also address empirical research in the field of bioethics. To identify the latest 400 publications in these four journals (100 articles per journal), we used the journal categories from PubMed by searching for the name of the journal and adding the term [Journal]. All hits were sorted by date, and the first 100 (for each search) were downloaded as an XML file including the title and abstract.
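The search described above can be sketched programmatically. The snippet below is a minimal illustration, not the authors' procedure: it assumes the NCBI E-utilities `esearch` endpoint is used instead of the PubMed web interface, and the function name `build_esearch_url` and the choice of `sort=pub_date` are assumptions made for this sketch. It only builds the query URLs and does not contact PubMed. Retrieving titles and abstracts as XML would require a further step that is not shown.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The four journals named in the text.
JOURNALS = [
    "Journal of Medical Ethics",
    "Nursing Ethics",
    "AJOB Empirical Bioethics",
    "BMC Medical Ethics",
]


def build_esearch_url(journal, retmax=100):
    """Build a PubMed search URL for the newest `retmax` records of one journal.

    Hypothetical helper: the [Journal] field tag mirrors the search term
    described in the text; the sort order is an assumption of this sketch.
    """
    params = {
        "db": "pubmed",
        "term": f'"{journal}"[Journal]',  # journal name restricted to the journal field
        "sort": "pub_date",               # assumed: newest publications first
        "retmax": retmax,                 # 100 records per journal, 400 in total
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


urls = [build_esearch_url(journal) for journal in JOURNALS]
for url in urls:
    print(url)
```

Building one URL per journal keeps the four searches independent, which matches the per-journal limit of 100 records described in the text.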

Eligibility screening

All downloaded studies were screened for exclusion based on title and abstract. The screening process was performed independently, in a blinded and standardized manner, by two reviewers (JS and HL), and disagreements between reviewers were resolved by consensus. Interrater agreement was measured on a random subsample of n = 50 using Cohen’s kappa, with sufficient reliability defined as κ > 0.8 (i.e. “very high” agreement) [21, 22]. To screen the title and abstract we used the open-source software Rayyan [23]. We excluded all publications that were (1) purely theoretical studies, (2) literature reviews, (3) case studies, (4) not original research, or (5) duplicates. All other studies were included in the following step of the analysis.
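For readers unfamiliar with the agreement statistic, Cohen’s kappa compares the observed agreement p_o between two raters with the agreement p_e expected by chance: κ = (p_o − p_e) / (1 − p_e). The following minimal sketch computes it for two lists of screening decisions. The decision data are invented for illustration and are not the study's screening results. The function name `cohens_kappa` is likewise an assumption of this sketch.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(rater_a)

    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal proportions per category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical subsample of 50 title/abstract decisions (illustrative only):
# 20 joint inclusions, 26 joint exclusions, 4 disagreements.
reviewer_1 = ["include"] * 20 + ["exclude"] * 26 + ["include"] * 2 + ["exclude"] * 2
reviewer_2 = ["include"] * 20 + ["exclude"] * 26 + ["exclude"] * 2 + ["include"] * 2

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}, sufficient = {kappa > 0.8}")
```

Because kappa corrects for chance agreement, a raw agreement of 92% in this invented example yields a noticeably lower kappa value, which is why the threshold is defined on κ rather than on percent agreement.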

Categorization of included publications

All included publications were assigned to one of three categories according to our predefined definitions (Table 1): (1) evaluative empirical research, (2) non-evaluative empirical research and (3) borderline cases. The categorization was first conducted on the basis of the title and abstract of the publication. For all publications categorized as “evaluative empirical research” or “borderline cases”, the full text was retrieved and the categorization process was repeated based on the complete manuscript.

Data extraction

For all publications categorized as evaluative empirical research we analyzed which objects (norms and recommendations) had been evaluated. Building on a publication by Sisk and colleagues, we deductively grouped the objects under three categories [24]. Sisk and colleagues argue for an “implementation mindset” that ethicists should adopt when translating ethical norms from abstract normative claims into concrete changes in practice. The framework they introduce is composed of four sequential processes: (1) Normative Ethics, (2) Applied Ethics, (3) Intervention and (4) Dissemination Policy. The results of these sequential processes are then categorized on three levels: (1) Aspirational Norms, (2) Specific Norms, and (3) Best Practice. We chose this framework because it offers a compelling and clear structure for the translational process of ethical norms. We further analyzed which evaluative approaches had been used. All information was extracted as defined in the protocol (https://osf.io/r6h4y/).

The quality of the included studies was not assessed as this is not relevant for answering our research questions about scope and objectives.

Analysis and statistics

The extracted data were analyzed by creating a data-driven coding frame as part of a qualitative content analysis [25]. For the qualitative content analysis the software MaxQDA (2020) was used [26]. To calculate summary measures (e.g. number of evaluative and non-evaluative studies) we used Microsoft® Excel for Mac version 16.61 (22050700).

Results

Search, inclusion and data extraction

As a basis for our analysis, we identified 400 publications from four bioethics journals. The screening process led to the exclusion of 166 publications, most of which were purely theoretical studies (n = 105, 26% of all 400 publications) (see PRISMA diagram [27], Fig. 2). The screening at title and abstract level showed high interrater reliability, with a Cohen’s kappa of 0.82. Publication dates ranged from 2014 to 2019, with the majority of studies published in 2019 (n = 254) or 2018 (n = 91).

Fig. 2 PRISMA diagram of the search and inclusion process

Evaluative and non-evaluative studies

The categorization process of the 234 included empirical studies resulted in 54% (n = 126) being categorized as non-evaluative empirical studies, 36% (n = 84) as evaluative empirical studies, and 10% (n = 24) as borderline cases (Fig. 2). The 84 studies categorized as evaluative were included in the further data extraction process. Figure 3 shows the proportions and number of evaluative studies per included journal.
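The reported proportions can be recomputed from the counts given above. The short sketch below does this with the figures from the text (400 screened, 166 excluded, and the three category counts). The helper name `share` is an assumption of this sketch, and rounding to whole percentages is assumed to match the reporting style.

```python
def share(count, total):
    """Return `count` as a percentage of `total`, rounded to a whole number."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total)


screened = 400                  # publications identified in the four journals
excluded = 166                  # removed during title/abstract screening
included = screened - excluded  # empirical studies entering categorization

categories = {
    "non-evaluative": 126,
    "evaluative": 84,
    "borderline": 24,
}

# The category counts must add up to the number of included studies.
assert sum(categories.values()) == included

shares = {name: share(count, included) for name, count in categories.items()}
for name, pct in shares.items():
    print(f"{name}: n = {categories[name]} ({pct}%)")
```

Using the 234 included studies as the denominator, rather than all 400 screened publications, is what yields the percentages reported for the three categories.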

Fig. 3 Number and percentage of evaluative studies per included journal

Evaluation objects

As described in the Methods section, we grouped the evaluation objects from the evaluative empirical studies under three categories: aspirational norms, specific norms, and best practices. Additional file 1: Table S1 explains these categories further and gives examples from our sample. Figure 4 presents the topics studied for these three categories.

Fig. 4 Categorization of evaluative empirical studies in bioethics

In five (6%) of the 84 evaluative empirical studies, the evaluation object reflected very broadly formulated “aspirational norms” such as “protect patient rights” or “resolve ethical conflicts”. In 14 studies (17%), more “specific norms” were evaluated, such as “reduce moral distress”, “improve ethical decision making”, or “avoid therapeutic misconception”. The majority, 65 studies (77%), evaluated concrete “best practices”.

We clustered the 65 studies evaluating best practices into five subgroups (percentages refer to all 84 evaluative studies): A) “ethical procedures”, such as informed consent or study approval procedures, were evaluated in 30 studies (36%); B) “ethical institutions”, such as ethics consultation services or ethics committees, in 15 studies (18%); C) “clinical and research practices” in 9 studies (11%); D) “educational programs” in 6 studies (7%); and E) “legal regulation” in 5 studies (6%).

Figure 4 presents more detailed information on the categorization of evaluative empirical studies. Additional file 2: Table S2 presents the complete categorization of the evaluative empirical studies (https://doi.org/10.17605/OSF.IO/R6H4).

Discussion

This cross-sectional study aimed to map and categorize empirical studies published in bioethics journals that evaluate how ethical recommendations (practice-oriented theories) are translated into concrete decision making (practice). We do not know of any other study with a similar approach. We found that a substantial proportion (36%, n = 84) of all empirical studies included in our sample from four bioethics journals have such an evaluative objective.

This finding indicates that bioethics already has a subfield or domain that studies the practice translation of ethical recommendations. Such a subfield would be in line with similar subfields in biomedicine or psychology, where evaluating how effectively, efficiently or validly certain practice-oriented (treatment or prevention) recommendations are translated or implemented into practice is a field in its own right. Translational research, implementation research, and health services research are prominent examples of such fields, each with its own methods and terminology. In contrast to these subfields in medicine or psychology, however, evaluative studies on bioethical recommendations have so far not been perceived as a field in their own right. The bioethics literature has so far primarily discussed the increased occurrence of empirical studies in general [2, 3], the (more or less legitimate) role of empirical studies in ethical reasoning [28] and the theory of evaluating ethical practice [29, 30].

Our study suggests two intertwined dimensions for structuring the field of evaluative/translational empirical studies in bioethics. First, we chose a deductive approach to distinguish three broader categories of evaluation objects, following Sisk and colleagues [24]. Our study found that the majority of evaluative studies address the third category, “best practices”, but we also found studies that aimed to evaluate (broad) “aspirational norms” or “specific norms” without addressing a concrete best practice. Second, based on the 65 studies evaluating best practices, we inductively developed five categories for types of best practices: c1) ethical procedures, c2) ethical institutions, c3) clinical or research practices, c4) educational programs, and c5) legal regulations.

Our study has limitations due to its explorative character; both suggested dimensions for structuring the field of evaluative/translational bioethics therefore need validation and might need refinement. For example, we used a broad definition of evaluative empirical research (see Table 1) to be sensitive in our analysis of this type of bioethical research. While we reached thematic saturation for the five categories of best practices, we need to stress that these categories were developed from a sample of studies that were all published in only four bioethics journals. In these journals the most frequently evaluated best practices were “informed consent”, “ethics consultation services”, and “study approval procedures”. These patterns might look different for empirical evaluative studies on ethics topics published in general medicine or specialty journals. Further studies including general medicine or specialty journals might help to validate or refine the five categories for types of best practices.

The results show a low proportion of evaluative empirical studies in the Journal of Medical Ethics compared to the three other included journals. Further research is needed to assess a connection between the amount of evaluative research and the type of journal. It was out of the scope of this explorative study to analyze the types of results that the empirical evaluative studies reported. Further research is needed to investigate to what extent evaluative studies actually help to improve the translation of ethical recommendations into practice. As introduced earlier, from a conceptual viewpoint the results of evaluation studies might address the effectiveness, efficiency or validity of the practice translations. Our study, however, did not empirically validate this conceptual distinction.

Sisk and colleagues call for ethicists to adopt an “implementation mindset” when formulating norms, and to collaborate with others who have the expertise needed to implement policies and practices [24]. Evaluative empirical studies can provide the information needed to successfully translate ethical recommendations into practice, and our study shows that these studies can be found at all stages of the translational process. We hope our mapping study facilitates discussions on how to further develop, and assure the quality of, the emerging field of empirical studies on the practice translation of ethical recommendations.