Background

In adults, single-sided deafness (SSD) can lead to a range of well-documented functional hearing difficulties, such as impaired spatial awareness [1, 2] and difficulty perceiving speech in demanding listening environments [3, 4], for example in the presence of background noise [5, 6]. The clinical management of SSD involves a range of interventions that can broadly be categorised as either rerouting sounds from the deaf ear to the hearing ear, via air or bone conduction, or restoring hearing to the deaf ear via a middle ear or cochlear implant. A systematic review of studies evaluating the effectiveness of these SSD interventions identified that outcome selection has been somewhat biased towards assessing functional impairments for which measures are readily available and widely used, e.g., tests of speech perception in noise [7]. However, the difficulties that SSD imposes can also affect the individual’s psychological and social well-being [8, 9], and therefore outcomes that assess the impact on an individual’s overall health and well-being are also relevant and potentially as important [10]. Healthcare users express uncertainty about the choice of treatment options for SSD, often due to a lack of clarity about their benefit [11]. A subsequent systematic review identified a total of 520 outcome domains from 96 studies that evaluated SSD interventions [12]. Generally speaking, interventions to reroute sounds (such as contralateral routing of signals (CROS) aids and bone anchored hearing aids (BAHA)) assessed the same outcome domains as restoring interventions (such as middle ear and cochlear implants). However, tinnitus-related outcomes and brain-related assessments of neural activity were almost exclusively limited to studies that evaluated the effect of cochlear implants, while rerouting studies were more concerned with the aversiveness of sounds. With regard to the choice of measurement instruments, the Speech, Spatial and Qualities of Hearing Scale (SSQ) [13], for example, is one of 73 different instruments used to measure speech-related outcomes, but this instrument has also been used to assess device benefit, hearing disability, quality of hearing, and spatial hearing [12]. This lack of consistency emphasises the need to define an agreed minimum standard for what is critically important to assess in all clinical trials evaluating SSD interventions. Without such consensus, it will remain challenging to make evidence-based decisions about the relative benefits of the different treatment options.

The need for harmonisation of assessment methods across trials of SSD interventions has already been acknowledged, and recommendations for a minimum set of outcome measures were made following panel discussions among expert clinical professionals at two international conferences in 2015 and 2016 [14]. These were daily device use; pure tone audiometry; free-field testing of speech perception in noise and sound localisation; the Speech, Spatial, and Qualities of hearing questionnaire [15]; the Health Utilities Index Mark 3 [16]; and, if applicable, the Tinnitus Functional Index [17]. A main limitation of this previous work is that it focussed on cochlear implants as a treatment for SSD, and so the expert panels comprised professionals from cochlear implantation centres. Further limitations were the restriction to outcome measurement instruments readily available in the hearing clinic (e.g., pure tone audiometry, standard audiometric and validated sentence tests, binaural effect measures) and the lack of healthcare user involvement in the decision-making process. Therefore, it is unclear whether the recommended measures assess the outcome domains that are most meaningful to healthcare users.

The present harmonisation study addressed these limitations by considering all SSD interventions and all professional experts, by deliberately focusing first on establishing what is important to measure, and by giving healthcare users’ input equal weighting to that of the professionals [18]. It advocates a consistent choice of outcomes to ensure high-quality, easily comparable trials that concentrate on important outcomes relevant to all stakeholders involved. The purpose of the study was to define an agreed minimum standard for what is critically important to assess in all clinical trials evaluating SSD interventions. The expected impact would be to increase the potential for evidence synthesis (i.e., meta-analysis) of published results in order to generate the evidence base required for commissioning of clinical services and for informed decision making between healthcare user and professional.

The study method was informed by a growing body of existing work by a community of core outcome set developers: the Core Outcome Measures in Effectiveness Trials (COMET) initiative [18]. COMET has established minimum standards to guide the process, as well as to help users of core outcome domain sets to evaluate whether they have been developed using appropriate methodology [19]. The COMET initiative has also published a handbook to support the development of consensus-based recommendations for core outcome domains [18]. An agreed minimum standard would not restrict trial investigators from assessing additional outcomes, but rather, it would aim to reduce diversity in reported outcomes and provide a basis for comparison between trials.

Methods

The full protocol for prioritising the CROSSSD outcome domains has been published [20], in addition to the methodology for conducting the consensus meeting online [21]. The process is outlined in Fig. 1. Our core outcome domain set development process was informed by the COMET Handbook version 1.0 [18] and the Core Outcome Set-STAndards for Development (COS-STAD) [19]. Ethical approval was granted by the Nottingham 2 Research Ethics Committee (Research Ethics Committee reference 19/EM/0222, Integrated Research Application System Project ID 239750) on 06 August 2019. The study is reported according to the Standards for Reporting Qualitative Research (SRQR) [22].

Fig. 1
figure 1

Overview of the process used to develop a core outcome domain set for clinical trials investigating SSD interventions in adults

In summary, the study comprised three steps:

  • Step 1: Generating a long list of candidate outcome domains utilised to date in clinical trials assessing rerouting and restoring interventions for adults with SSD.

  • Step 2: Prioritising which of these outcome domains are critically important to measure when assessing whether an SSD intervention has worked by involving a large representative set of SSD stakeholders (healthcare users and professionals).

  • Step 3: Reaching a final consensus decision with a subset of stakeholder representatives on which outcome domains are sufficiently critically important to constitute the core outcome domain set for SSD interventions in adults.

Changes from protocol

There were no significant deviations from the published protocol [20], except the use of a web-based meeting in place of a face-to-face consensus meeting due to travel and physical distancing restrictions imposed by the coronavirus disease (COVID-19) pandemic [21]. A non-substantial category C amendment to the ethical approval was obtained to accommodate this change (Sponsor reference: 19032, minor amendment reference number: NSA01). Informed consent was obtained from those choosing to participate in the online Delphi (e-Delphi) surveys and prior to participation in the consensus meeting.

Participants

All relevant groups of stakeholders were invited to participate if they met the following inclusion criteria: (i) healthcare users with lived experience of SSD for 12 months or more, who had received or considered an SSD intervention; (ii) healthcare professionals (e.g., audiologists, ENT surgeons, neuro-otologists) with experience of assessing, diagnosing, or managing SSD in adults; (iii) clinical researchers with recent experience of SSD intervention studies; (iv) commercial representatives who worked for industry partners that developed, manufactured, or sold hearing aids or auditory implants used as SSD interventions; and (v) those employed by organisations that fund research focusing on SSD interventions. Recruitment used a purposive sampling method to engage qualified experts with a deep understanding of the topic. Advertisements were targeted at relevant international conferences, known professional bodies, related professional societies and charities, personal contacts of the study steering group, UK-based hearing clinics, and more generally through social media groups. Communications by charities and UK-based hearing clinics were the main routes for healthcare user recruitment. For more details, see Katiri et al. [20]. All participants were required to be at least 18 years old and able to read, understand, and complete web-based surveys in English.

e-Delphi surveys

The recruitment target was for at least 20 participants in each of the three major stakeholder groups (healthcare users, healthcare professionals, clinical researchers) to complete both rounds of the e-Delphi survey. To minimise attrition between the two survey rounds, the importance of completing both rounds was emphasised to participants in the information sheets and during Round 1. Our operational definition of completion was that at least 50% of the outcome domains were scored. This criterion ensured that participants contributing to the decision-making were fully engaged in the process and had sufficient expertise to form personal judgements. Each round was open for a defined period (Round 1 for 10.2 weeks and Round 2 for 13.3 weeks), and Round 2 opened the day after the completion of Round 1. Response rates were monitored weekly, and both generic and personalised email reminders were sent to late responders to minimise attrition.

A modified Delphi method (steps 1 and 2, Fig. 1) was adopted, presenting participants with a long list of candidate outcome domains, each accompanied by a plain language definition. The list of outcome domains was compiled using information derived from (i) a systematic review of the literature [12], (ii) available qualitative data [9], and (iii) discussion during a workshop that involved members of the study management team, two public research partners (i.e., people with lived experience of SSD), and an expert in patient and public involvement. This workshop group developed the plain language outcome domain definitions, which were further reviewed by the study steering group for clarity of language and description prior to use in the e-Delphi surveys. The study steering group comprised a clinical audiologist, two otolaryngologists (ENT surgeons), two clinical researchers in hearing science, and one clinical researcher in speech and hearing science who had concomitant experience in clinical audiology.

The process of identifying and systematically refining the selection of candidate outcome domains is summarised in Fig. 2. A total of 433 outcome domains were extracted from studies included in the systematic review [12]. We removed 217 by excluding duplicates and grouping similar outcome domains. The remaining 216 were discussed during the workshop, during which a further 83 domains were removed. Reasons for removal included (i) the domain was a metric (e.g., thresholds, tonotopy) (n = 17), (ii) the domain referred to the measuring instrument (e.g., transcranial attenuation, cortical change) (n = 7), (iii) the domain was too generic or vaguely defined (e.g., handicap, cognition) (n = 51), and (iv) the domain was device specific (e.g., periodontal, dental measures) (n = 8). However, three outcome domains from reviews of the literature on the broader effects of SSD were added because they were deemed relevant [9]. These were personal safety (e.g., road safety, independent living), motivation (e.g., to engage in challenging listening situations), and mood (e.g., general sense of well-being). Following high-level categorisation and further re-grouping, the long list for the e-Delphi survey comprised a total of 44 outcome domains. For ease of presentation to survey participants, these outcome domains were arranged into 10 categories: (1) factors related to the treatment being tested, (2) health-related quality of life, (3) hearing disability, (4) other effects, (5) physical effects, (6) psychological effects, (7) self, (8) sound quality, (9) spatial hearing, and (10) tinnitus. Each outcome domain had a plain language definition explaining in more detail the unique construct it encapsulated (see table, Additional file 1, for the list and definitions of initial domains). Most outcome domains described benefits to healthcare users. Category 4 encapsulated any bad or unexpected thing that might happen during the time an SSD treatment is being tested in a clinical trial, i.e., an adverse event.
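The reduction from 433 extracted domains to the 44 presented in Round 1 involves several intermediate counts. The following minimal sketch (illustrative only; the variable names are ours and not part of the study methods) checks that the arithmetic reported above is internally consistent:

```python
# Bookkeeping check of the domain-reduction counts reported above (illustrative only).
extracted = 433                                  # domains extracted from the systematic review [12]
after_grouping = extracted - 217                 # duplicates removed and similar domains grouped
assert after_grouping == 216

workshop_removals = {
    "domain was a metric": 17,
    "domain referred to the measuring instrument": 7,
    "domain too generic or vaguely defined": 51,
    "domain was device specific": 8,
}
assert sum(workshop_removals.values()) == 83     # total removed at the workshop

after_workshop = after_grouping - sum(workshop_removals.values())   # 133
with_additions = after_workshop + 3              # personal safety, motivation, mood added [9]

# High-level categorisation and further re-grouping then reduced the 136
# remaining domains to the 44 presented in Round 1 of the e-Delphi survey.
print(f"{with_additions} domains before re-grouping; 44 in the Round 1 long list")
```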

Fig. 2
figure 2

Summary of the steps taken in the outcome domain prioritisation process to agree a final core outcome domain set for clinical trials assessing SSD interventions in adults

The 44 outcome domains were presented in both rounds of the e-Delphi survey. At the end of Round 1, participants were invited to propose additional outcome domains. Two members of the study management team and a public research partner reviewed all proposals, and five novel outcome domains (device usability, impact on learning, concern about your hearing, vulnerability, and independence) were added to Round 2, each with a corresponding plain language definition. Outcome domain scoring for the two Delphi rounds was managed using the DelphiManager v4.0 software tool developed and maintained by the COMET initiative at the University of Liverpool. A unique identifier was assigned to each participant, linked to their email address, which allowed tracking of participant activity. Outcome domain items were presented in a random order to reduce the potential for systematic contextual effects on scoring, as observed by Brookes et al. [23].

For each outcome domain, participants were asked to consider how important it is to measure when deciding whether an SSD intervention works. Participants scored each outcome domain using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) scale of 1 to 9 [24]. Scoring therefore used a Likert-type scale with additional interpretation categories: 1–3 indicated that the outcome domain was not important in deciding whether an SSD intervention is effective, 4–6 indicated that the outcome domain was important but not critical, and 7–9 indicated that it was critically important to measure in all trials of SSD interventions. Participants were also given an ‘unable to score’ option and could comment on any aspects of the scoring or outcome domains using open-text boxes.

In Round 2, participants were presented with the score they personally gave each outcome domain during Round 1, together with numerical and graphical feedback on the distribution of scores across the key stakeholder groups (healthcare users, healthcare professionals, clinical researchers, or commercial representatives) (see histograms, Additional file 2, for output from Round 2). The protocol set out to consider commercial representatives and funders collectively as one stakeholder group, because similar opinions were expected and the number of participants was anticipated to be small [20]. However, no funders participated in Round 2. Feedback enabled participants to reflect on their scores in light of the distribution of scores from their own and the other stakeholder groups, and to re-score each outcome domain if they chose to do so. Consensus was defined as at least 70% of the participants in all three stakeholder groups scoring 7–9 (critical to measure in all trials) and fewer than 15% in any stakeholder group scoring 1–3 (not important in deciding whether an SSD intervention is effective).
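As a concrete illustration of this consensus definition, the sketch below applies the two thresholds to a single outcome domain. It assumes per-group lists of 1–9 GRADE scores; the function name and example data are hypothetical and not taken from the study.

```python
from typing import Dict, List

def meets_delphi_consensus(scores_by_group: Dict[str, List[int]]) -> bool:
    """Consensus for one outcome domain: >= 70% of participants in every
    stakeholder group score 7-9 (critical), and < 15% in any group score
    1-3 (not important)."""
    for scores in scores_by_group.values():
        if not scores:
            return False
        critical = sum(7 <= s <= 9 for s in scores) / len(scores)
        not_important = sum(1 <= s <= 3 for s in scores) / len(scores)
        if critical < 0.70 or not_important >= 0.15:
            return False
    return True

# Hypothetical Round 2 scores for one domain across the three key stakeholder groups.
example_scores = {
    "healthcare users":         [8, 9, 7, 7, 6, 9, 8, 7, 9, 8],
    "healthcare professionals": [7, 8, 9, 7, 7, 5, 8, 9, 7, 8],
    "clinical researchers":     [9, 7, 8, 8, 7, 7, 9, 6, 8, 7],
}
print(meets_delphi_consensus(example_scores))  # True: every group >= 70% critical, no 1-3 scores
```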

Consensus meeting

Participants who completed scoring for at least 90% of outcome domains in both rounds of the e-Delphi survey were invited to express their interest in attending the consensus meeting. A 50:50 balance of healthcare users and healthcare professionals was maintained in recruitment. A small number of individuals who had an interest in the process, but did not fit the inclusion criteria to actively participate in the consensus process, joined as observers.

The web-based consensus meeting was 7 h in duration, with three 30-min long breaks, and delivered using Microsoft Teams software. The meeting comprised semi-structured discussions led by three expert facilitators in three small groups (group A, group B, group C), together with large group discussions and voting involving all 12 participants (6 healthcare users and 6 healthcare professionals). The voting participants’ demographics and expertise can be found in Table 1. The two public research partners and the patient and public involvement expert were also present and could take part in the discussions but not vote. All participants were given equal turns to voice their opinions.

Table 1 Consensus meeting voting participants’ demographics and expertise

In advance of the consensus meeting, voting participants were sent a link to a short survey asking them to consider the outcome domains that had met the criteria for inclusion in the core outcome domain set after the two rounds of the e-Delphi survey and to vote on whether they agreed to limit the scope of the consensus meeting to discussing only those outcome domains. Anonymised voting was performed using online surveys in real time during the meeting, which asked participants to vote agree, disagree, or unsure in response to questions about the inclusion or exclusion of specific outcome domains. Only outcome domains voted in by at least 70% of participants were included in the core outcome domain set. In cases where this threshold was not achieved, outcome domains were set aside for re-discussion and re-voting. The results of voting were presented using histograms embedded in PowerPoint slides shared with all participants via Microsoft Teams. All consensus meeting discussions were recorded.
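To make the voting threshold explicit, a minimal sketch follows. The function name is hypothetical, and treating ‘disagree’ and ‘unsure’ votes as counting against the 70% threshold is our assumption based on the description above.

```python
def voted_in(votes: list[str], threshold: float = 0.70) -> bool:
    """True if at least `threshold` of consensus-meeting voters voted 'agree'
    to include the outcome domain; any other vote counts against inclusion."""
    return bool(votes) and votes.count("agree") / len(votes) >= threshold

# With 12 voting participants, 9 'agree' votes clear the threshold (75%),
# whereas 8 do not (about 67%).
print(voted_in(["agree"] * 9 + ["unsure"] * 3))    # True
print(voted_in(["agree"] * 8 + ["disagree"] * 4))  # False
```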

The meeting plans were revised to comply with the travel and physical distancing restrictions imposed by the UK government in 2020, in response to the COVID-19 pandemic. To minimise screen time during the web-based consensus meeting, participants were sent a pre-recorded introductory presentation in advance (see video, Additional file 3, for the introductory presentation) describing the aims of the day, and a guidance document outlining the day’s activities (see text file, Additional file 4, for the meeting plan and guidance document). Participant consent was obtained online.

Results

Participants

Of the 308 participants who completed Round 1, 92 (29.9%) were healthcare users, 148 (48.1%) were healthcare professionals, and 59 (19.2%) were clinical researchers (see Additional file 5: Table S5a, for participant numbers). Thirty-one participants (11 healthcare users, 13 healthcare professionals, 6 clinical researchers, 1 funder) rated fewer than 50% of outcome domains in Round 1 and so were excluded from Round 2. The retention rate exceeded 85% for all stakeholder groups, with the exception of clinical researchers (74%) and funders (0%) (see Additional file 5: Table S5a, for participant numbers). Most of the consenting participants (n = 98, 31.8%) were in the 30–39 years group, followed by the 40–49 years group (n = 77, 25.0%). Only healthcare users (n = 14, 15.2%) were aged above 69 years (see Additional file 5: Table S5b, for participant numbers).

Participants registered from 29 different countries (Fig. 3). The majority were from the UK (n = 145, 47.1%), followed by Ireland and the US (n = 37, 12.0% each) (see Additional file 5: Table S5c, for a detailed breakdown of participant registrations in each stakeholder group per country, and Table S5d for a detailed breakdown of their language for everyday communication).

Fig. 3
figure 3

World map illustrating the geographical distribution of all consenting participants (n = 308). The numbers of healthcare users and healthcare professionals for the five countries from which most participants were recruited are also listed

Of the healthcare professionals who reported their roles (n = 146), most were audiologists or clinical scientists in audiology (n = 107, 73.3%), followed by otolaryngologists/ENT surgeons (n = 11, 7.5%).

Of those healthcare users (n = 84, 91.3%) who disclosed the time since their diagnosis of SSD, most had a history of SSD of 2–5 years (n = 18, 21.4%), followed by 5–10 years or 10–20 years (n = 16, 19.0% in both cases) (see Additional file 5: Figure S5e, for details on the time since SSD diagnosis as disclosed by healthcare users). Almost all (n = 81, 88.0%) healthcare users disclosed the devices they had trialled or were primarily using. The majority (n = 47, 58.0%) had trialled or were using a CROS device, and a minority (n = 9, 11.1%) had trialled a BAHA. One participant (1.2%) had received a cochlear implant. Twelve participants (14.8%) had trialled two devices (CROS and BAHA). Three healthcare users (3.7%) reported that they had trialled three devices, including a combination of CROS, BAHA, remote microphone technology, middle ear, or cochlear implants. When asked what interventions they had considered trialling, 60 healthcare users (65.2%) provided a response. Most (n = 23, 38.3%) had considered a CROS aid, 10 (16.7%) had considered a BAHA, and one (1.7%) had considered the SoundBiteTM (Newport Beach, California). Five participants had considered trialling restoring interventions: cochlear implantation (n = 3, 5.0%) and middle ear implants (n = 2, 3.3%). The remaining 18 participants (30.0%) indicated that they had considered trialling a combination of two or more re-routing and/or restoring interventions including CROS, BAHA, the SoundBiteTM, middle ear implants, or cochlear implantation. Three (5.0%) participants commented that they were not aware of alternative options they had to consider.

e-Delphi surveys

Three hundred and eight participants completed e-Delphi Round 1 (see table, Additional file 2, for the ratings for the 44 outcome domains). Of those, 54 participants submitted 95 comments about potential additional outcome domains that they felt should be included in the list (see table, Additional file 6, for the list of comments provided in Round 1). Most comments were from healthcare users (n = 43, 45.3%) and healthcare professionals (n = 35, 36.8%), and the rest were from clinical researchers (n = 16, 16.8%) and commercial representatives (n = 1, 1.0%).

Fifty-six of the proposed additional outcome domains were already captured by one or more of the existing domains. Twenty-four comments were rejected as out of scope (e.g., they concerned economic factors, auditory training, or access to hearing loss support groups). Following discussion and feedback from the public research partners and the study steering group, a further nine proposed additional outcome domains were rejected as less relevant. These included aspects of trial device usage, aetiology of the SSD, stigma, sound effects of the device, access to audiological services for timely device adjustment, family support, and availability of auditory implants in different healthcare systems. Three comments (ability to manage the treatment option, ease of use, complexity of the device) seemed to refer to the same concept. Hence, five additional outcome domains were included in Round 2: device usability, impact on learning, concern about your hearing, vulnerability, and independence. Two members of the study management team categorised and assigned plain language definitions to these five outcome domains prior to launching Round 2 of the e-Delphi survey. The plain language definitions and categories for these outcome domains can be found in Additional file 1.

Two hundred and thirty-three participants completed Round 2 (see histograms, Additional file 2, for the ratings for the 49 outcome domains included in Round 2). A number of scores changed between Rounds 1 and 2, such that some outcome domains reached consensus at Round 2 but not at Round 1. Most of these changes were made by the clinical researcher and commercial representative groups, with many comments indicating that scores were changed after reviewing the healthcare user group responses. Six outcome domains (device usage, discomfort in listening situations, group conversation in quiet, listening in reverberant conditions, physical tiredness, and spatial orientation) changed for two stakeholder groups. Nine outcome domains (adverse events, being aware of a sound, dissatisfaction with life, emotional distress, mood, motivation, personal safety, and protecting your hearing) changed for one stakeholder group. Two tinnitus-related outcome domains that reached consensus within the commercial representatives at Round 1 did not reach consensus at Round 2 when ratings from the healthcare users were considered: loudness and tinnitus-related brain changes. Finally, at Round 1, none of the four stakeholder groups reached consensus to include device malfunction, but all four groups did so at Round 2.

Overall, from Round 2, stakeholder scoring for 17 outcome domains (Table 2) met the consensus criterion for inclusion (see table, Additional file 1, for more details on ratings in Round 2). These were taken forward to the consensus meeting. Participants did not reach consensus on the importance of the remaining 32 outcome domains. These included adverse events, which was the only harms-related outcome domain. Adverse events was scored 7–9 by 51% of healthcare users, 59% of healthcare professionals, 81% of clinical researchers, and 100% of commercial representatives. The 32 outcome domains were discussed by the study management group and the consensus meeting facilitators. A decision was taken to remove them from the consensus meeting discussions as they did not meet the pre-determined criteria for being critically important for inclusion.

Table 2 Voting scores for the 17 outcome domains that met the criterion for inclusion. Bold font denotes the three core outcome domains. For plain language definitions of the outcome domains, see table, Additional file 1

Pre-consensus meeting survey

All 12 consensus meeting participants completed this survey. Ten (83.3%) participants agreed with the recommendation to discuss only the 17 outcome domains that met the consensus criterion for inclusion at the end of the e-Delphi survey process; only one participant disagreed (8.3%). With regard to their personal choice of ‘top 3’ out of the 17 candidate outcome domains, the two most popular were listening in complex situations and impact on social situations, each selected by five participants. Next were sound localisation, personal safety, listening effort, and group conversations in noisy social situations, each selected by four participants. Physical tiredness, impact on work, and device malfunction were not chosen by any of the participants. The remaining eight outcome domains were selected by either one or two participants.

Consensus meeting

Participants first agreed (83.3% agreement) to set aside three outcome domains: physical tiredness, impact on work, and device malfunction. The remaining 14 outcome domains were discussed in two small group discussions and subsequently voted on. Initial votes concerned whether to exclude outcome domains for which there was an apparent lack of consensus to include them, and subsequent votes concerned whether to include the remaining outcome domains in the core outcome domain set (Fig. 4). Participants agreed that three outcome domains should form the minimum standard: impact on social situations (100% agreement), group conversations in noisy social situations (83.3% agreement), and spatial orientation (100% agreement). Supporting comments made during the consensus discussions can be found in Table 3, and there were no comments against their inclusion. Five outcome domains (listening effort, device usage, being aware of a sound, listening in complex situations, and one-to-one conversations in general noise) required more extensive discussion among the group to hear a variety of opinions, but at voting none of these met the consensus criteria for inclusion (Fig. 4).

Fig. 4
figure 4

Outcome domain elimination process during the consensus meeting. Only outcome domains voted in by at least 70% of participants were included in the core outcome domain set

Table 3 Comments in favour of inclusion of these outcome domains, and other general discussion points, as extracted from the consensus meeting discussions

Participant feedback

The final core outcome domain set was shared with the 219 participants (64 healthcare users, 113 healthcare professionals, 36 clinical researchers, six commercial representatives) who completed both rounds of the e-Delphi survey but did not join the consensus meeting. Ninety-five (43.4%) participants responded: 32 healthcare users, 48 healthcare professionals, 14 clinical researchers, and one commercial representative (within-group response rates of 50.0%, 42.5%, 38.9%, and 16.7%, respectively). Overall, 73 participants (76.8%) responded that they were very satisfied with the choice of included outcome domains, and 19 (20.0%) indicated that they were somewhat satisfied. One participant (a healthcare professional) was neither satisfied nor dissatisfied. Two participants (2.1%) responded that they were very dissatisfied; one was a healthcare user who commented that the batteries were too expensive and the device was unsuitable for their ear, and the other was a healthcare professional who commented that “the outcome domains chosen is what matters most to patients”. None of the participants indicated that they were “somewhat dissatisfied” (see table, Additional file 7, for a detailed breakdown of the participant feedback).

Discussion

This study completes the first step in the development of a core outcome set for SSD interventions: reaching agreement on the what, i.e., the set of standardised outcomes to measure as a minimum in this clinical area [25]. A large group of stakeholders came to a consensus about those outcome domains that are important to measure in adults with SSD, and three outcome domains were identified by consensus as being critically important to measure in all clinical trials of SSD interventions in adults. The core outcome domain set recommends the most important outcome domains to measure in order to promote greater consistency across trials. The development process has also provided detailed information about the importance of a broader set of outcome domains from a diverse and international group of key stakeholders, which can inform the selection of primary and secondary outcomes for future trials of SSD interventions more generally.

The current work, in focusing on prioritising outcome domains, is complementary to the previous work to establish a unified testing framework for SSD, as recommended by Van de Heyning et al. [14]. That previous work aimed to harmonise assessment methods for treatment options prescribed for SSD and asymmetrical hearing loss in clinical practice and was based on the set of measures that were available, familiar, commonly used in previous studies, and judged to be relevant and appropriate for this clinical population by clinical professionals in otolaryngology and audiology. This approach, focusing on available measures (the how), is a pragmatic solution to addressing the heterogeneity in the selection of outcome measures that has been observed in interventional studies involving adults with SSD [7]. The CROSSSD study perspective was complementary in that its aim was to identify which outcome domains are critical to measure (the what), irrespective of whether measures with which those outcome domains could be assessed already exist or are known to the SSD research community. In doing so, it also engaged a broader group of stakeholders using a consensus methodology that incorporates the views of all relevant parties, including healthcare users, healthcare professionals, clinical researchers, and commercial representatives.

Perhaps reassuringly, two of the measures recommended by Van de Heyning et al. [14] (speech in noise testing and localisation testing) appear to be closely related to two of the outcome domains included in the CROSSSD core outcome domain set (group conversations in noisy social situations, and spatial orientation). The third outcome domain in the CROSSSD core outcome domain set (impact on social situations) also relates to quality of life, which was identified more generally as an outcome of interest by Van de Heyning et al. [14], who recommended the use of the Health Utilities Index Mark 3 (HUI3) questionnaire [16]. The HUI3 is, however, rarely reported as a quality of life measure in published SSD intervention studies; instead, the EuroQol five-dimension, three-level questionnaire (EQ-5D-3L) [26], the World Health Organization Quality of Life abbreviated questionnaire (WHOQOL-BREF) [27], and the Medical Outcomes Study 36-item Short Form (SF-36) questionnaire [28] tend to be reported [12]. Furthermore, the suitability of the HUI3 multi-attribute score for detecting health-related quality of life changes in cochlear implant recipients has been questioned [29]: for those aged over 55 years, the authors suggested that hearing loss comorbidities and psychosocial factors compromise its sensitivity to implantation-related changes. Of the other measures recommended by the previous consensus study, device usage was discussed extensively during the CROSSSD consensus meeting but only 25% of the stakeholders voted it critically important to measure in all trials, and tinnitus-related outcome domains did not reach consensus at the e-Delphi stage. The former decision may reflect that the importance of device usage as an outcome can vary depending on the nature of the devices being trialled; e.g., usage of cochlear implants during all waking hours is generally strongly encouraged to promote adaptation to the electrical stimulation [30], whereas effective use of devices like CROS aids may involve being judicious about when and where they are used [31]. The lack of consensus around tinnitus could be attributed to it not being a symptom experienced by all healthcare users with SSD, and one that is only likely to be modified by some interventions (e.g., a cochlear implant or middle ear implant). As such, it is not an outcome domain that is equally important and critical to all trials of SSD interventions. For SSD trials in which tinnitus is a main outcome of interest, the core outcome domain set for sound-based interventions for tinnitus could be considered [32].

Although adverse events did not meet the criteria for inclusion, we recognise that it is important to rigorously monitor and report adverse aspects of interventions when synthesising evidence [33]. Every healthcare intervention is associated with a risk of harms that should be balanced against therapeutically beneficial outcomes. Despite this, a systematic review showed that the reporting of harms data in randomised controlled trials across a range of clinical specialties failed to meet the Consolidated Standards of Reporting Trials (CONSORT) harms criteria [34], and harms are rarely reported in trials of SSD interventions [7]. There may be a number of reasons for this: a mismatch between investigator-led assessment of harms and the experience of patients; misreporting because harms are highly diverse; documentation within the trial but under-reporting by investigators, or influence by sponsors; and short-term follow-up that fails to detect long-term harms. In our study, we noted that a majority of clinical researchers recognised the importance of reporting adverse events, but fewer healthcare users and healthcare professionals did so. There is no standard way of handling adverse events during core outcome domain set development, and different teams have taken different approaches. Some, like CROSSSD, are driven purely by the consensus process [35, 36, 37], whereas others are driven by panel discussions [38]. Others agreed that adverse events should be reported as per good clinical practice guidelines and are thus relevant to all clinical trials, so they fall outside the core outcome domain set concept [39]. A similar situation to CROSSSD arose in the development of multiple core outcome domain sets for tinnitus trials, in that all outcome domains related to intervention-related benefits rather than harms [32]. However, the authors were explicit in highlighting the importance of assessing and reporting harms regardless of whether they were included in the core outcome domain set. There is no apparent rationale for why an equal emphasis should not also be placed on the assessment and reporting of harms in trials of SSD interventions, to promote patient safety, good clinical practice, and adherence to the CONSORT recommendations, and we would advocate that approach.

Strengths and limitations

The CROSSSD study’s approach to developing a core outcome domain set for SSD interventions in adults, using COMET initiative recommendations [18], proved effective, as demonstrated by low attrition rates at the e-Delphi survey stage and by positive participant feedback. The recommended outcome domains are deemed critically important to measure by all relevant stakeholders, including both healthcare users and professionals. The methodological approaches used at all stages of the study ensured that all opinions were considered and that the resulting decisions were not biased towards the views of clinical researchers or healthcare professionals. Future steps will concentrate on identifying or rigorously developing instruments that can successfully measure each outcome domain.

Despite efforts to fully represent stakeholders internationally (e.g., recruitment strategies linking in ENT and audiology global ambassadors, professional bodies and charities, and the CROSSSD steering group representatives), recruitment from low- and middle-income countries was limited. This is an acknowledged challenge in core outcome domain set development, and previous studies have recommended that geographical and income-based differences should be considered in outcome prioritisation [40]. In the current study, the use of a web-based e-Delphi approach presented few barriers to participation, and specific attention was given to recruiting stakeholders from various backgrounds for the web-based consensus meeting. Given that most published clinical trials of SSD interventions have been conducted in North America, Australia, and Europe, and that the CROSSSD study focused on developing a core outcome domain set for clinical trials, the sample of participants involved in developing the core outcome domain set is representative of the geographical regions in which future research on SSD interventions is likely to take place. However, further work that includes identifying and appraising measurement instruments to assess the outcome domains in the core outcome domain set should ensure that accessibility is considered as part of that process.

Although our recruitment strategy sought to engage a diversity of participants, there was a predominance of CROS aid healthcare users, audiology healthcare professionals, and female participants. These reflect real-world imbalances in current clinical practice. With respect to healthcare users, the greater number of CROS aid users is not surprising, since this device is the longest-standing non-surgical remediation solution for SSD [41, 42]. With respect to healthcare professionals, in many of the participating countries audiologists are the first point of contact for SSD interventions because they assess, counsel, and rehabilitate hearing aid and auditory implant users.

There was no restriction on the minimum number of years of clinical experience of participating healthcare professionals, which might have had an impact on the choice of outcome domains [43]. Variables such as knowledge of outcome measures and having a master’s level qualification can be influencing factors [44]. In the context of core outcome set development, time for reflection and vicarious thinking have been reported as important drivers of score changes when choosing outcome measures in Round 2 [45].

Participants commented on the clarity of definitions and the choice of language used. Despite involving public research partners from the inception of the current study, as per recommended standards [18], and having the outcome domain definitions reviewed by the study management team and steering group representatives, participants still detected ambiguity during both the e-Delphi surveys and the consensus meeting. This challenge was also noted by colleagues [46], who co-produced plain language descriptors and introduced additional examples to their definitions. Other core outcome set developers have translated their surveys into other languages, aiming to capture several culturally diverse regions and optimise global participation [47], but they found very little variation in opinion within stakeholder groups when participant region and other characteristics were considered [48]. Future core outcome domain set developers could seek feedback on the clarity of definitions from a larger number of stakeholder representatives internationally (e.g., via charities or professional bodies) to address any ambiguity.

Conclusions

The three recommended outcome domains represent what is deemed critically important to measure as a minimum in all trials, and they clearly met our consensus criteria. However, this list does not restrict clinical trialists from assessing other outcomes. Five other outcome domains (listening effort, one-to-one conversation in general noise, being aware of a sound, device usage, and listening in complex situations) reached the final stages of elimination and were considered highly important by stakeholders throughout the process, despite not making it into the core outcome domain set. Therefore, special consideration could be given to these, as well as to the core outcome domains, when designing future clinical trials. Wide adoption of the resulting core outcome domain set in trials investigating SSD interventions will lead to high-quality, easily comparable trials that concentrate on important outcomes relevant to all stakeholders involved. Moreover, there is an increasing expectation from funders that the choice of outcome measures be justified [49].

Stakeholder awareness, dissemination, publicity, and promotion of adoption and implementation of the core outcome domain set are ongoing. For example, a short video on YouTube aims to raise awareness of the study outcomes [50]. If the core outcome domain set is widely adopted, the well-documented problems of research waste [51] will be addressed, unnecessary duplication of effort and outcome-reporting bias will be reduced, and evidence synthesis in the SSD field will be enhanced. The next steps will concentrate on determining how these outcomes should be operationalised and measured, i.e., the identification or development of robust instruments that can measure the chosen outcome domains. Consideration will also be given to the instruments’ relevance to clinical practice from the perspective of healthcare users and professionals. Subsequent development of unified testing guidelines that incorporate the core outcome domain set, appropriate measurement instruments, and the time-frame of measurement will reduce the diversity and inconsistency of reported measures in the field of SSD.