Introduction

Evidence-informed clinical practice guidelines are essential tools to support high-quality decision-making in healthcare [1, 2]. Ideally, they translate a combination of scientific evidence, clinical experience, values, and preferences into clinical practice recommendations that are relevant to the diagnosis and treatment of individual patients [3]. Their validity relies on a transparent development process using internationally accepted standards and methods [2, 3, 4]. The methods employed should be documented and made accessible and can be assessed using validated tools such as AGREE-II [5].

The trustworthiness of clinical practice guidelines also depends on implicit factors that are not typically recorded in guideline documents, such as the adequacy of the time allotted to each task, the selection and conduct of panel members, and the management of conflicts of interest. Such implicit factors may be evaluated using the PANELVIEW instrument, which was created to evaluate the process and outcome of clinical practice guideline development from the perspective of the guideline group [6]. The instrument itself was developed with a focus on transparency and stakeholder involvement and then tested with eight international guideline groups. It consists of 15 domains with a total of 34 items (see Table 1).

Table 1 Characteristics of study participants

The German evidence- and consensus-based guideline on the treatment of patients with severe/multiple injuries (‘German polytrauma guideline’) [7] is an interdisciplinary and interprofessional guideline initially published in 2002, with updates scheduled every 5 years [8]. The guideline contains over 300 recommendations across 39 topic areas and covers the diagnosis and management of various types of injury in the prehospital, emergency room, and primary surgical setting. It is coordinated and funded by the German Society for Trauma Surgery (DGU), a non-profit scientific medical society, and its development involves delegates representing > 20 scientific medical societies and additional organisations such as emergency services associations. Based on the results of systematic evidence reviews, teams of content experts propose the wording and strength of recommendations using a three-level scheme for grading recommendations endorsed by the Association of the Scientific Medical Societies in Germany (Arbeitsgemeinschaft der Wissenschaftlichen Medizinischen Fachgesellschaften, AWMF) [9]. Delegates of all medical societies and organisations involved (‘guideline group members’) discuss these proposals during structured consensus conferences and vote on the final wording and strength of each recommendation in a neutrally moderated, structured consensus process, in line with the requirements of the AWMF guidance for guideline projects.

The objective of this study was to evaluate the 2022 update of the German polytrauma guideline from the perspective of the guideline group and to identify areas where this process may be improved in the future. This is the first published application of PANELVIEW to guideline updating following its release. During finalisation of the manuscript, an application of the instrument to guideline adoption/adaptation/development (adolopment) was published by Khabsa et al. [10].

Methods

Study design

This survey study is reported in accordance with the Consensus-Based Checklist for Reporting of Survey Studies (CROSS) [11].

Data collection methods

We used the German-language translation of the PANELVIEW instrument [12]. The instrument consists of 15 domains with a total of 34 items, and agreement with each survey item is rated on a 7-point Likert scale (7: fully agree; 1: fully disagree) [6]. There is an option to select ‘not applicable’. At the end of the survey, free-text fields are provided for comments on other factors that may have influenced the appropriateness of the process and/or satisfaction of the guideline group and further comments on the guideline development process.

Sample characteristics

All contributors to the guideline process who participated in at least one of the five consensus conferences were invited to participate: the steering committee including the guideline chair and methodologist, the interdisciplinary guideline group composed of delegates of participating societies/organisations, an independent moderator affiliated with the AWMF, and the clinical experts who led the writing groups for recommendations discussed during the consensus conferences. Eligible persons were encouraged to participate in the survey after each consensus conference they attended.

Survey administration

The survey was set up by the PANELVIEW team at McMaster University in Canada and conducted online via SurveyMonkey (SurveyMonkey, Momentive Inc, San Mateo, California, USA, www.momentive.ai). Consensus conferences took place on 14 June 2021, 13 September 2021, 26 January 2022, 14 February 2022, and 15 March 2022. Three separate survey links were generated: one after the first consensus conference, one after the second, and one after the final conference; the third survey was administered to all participants of the three consensus conferences that took place in 2022. Within 7 weeks of completion of the first, the second, and the set of three final consensus conferences, the guideline office sent the corresponding survey link to all participants. A reminder was sent after 2 to 4 weeks, and the survey was closed after 4 to 8 weeks. Responses were voluntary and anonymous.

Ethical considerations

Administration of the PANELVIEW instrument to guideline groups is covered by ethics approval (Hamilton Integrated Research Ethics Board Project #14-867). Publication of the aggregate results was approved by the guideline-developing organisation, the German Society for Trauma Surgery (DGU). No additional ethics approval was required because this survey was a process evaluation, the subjects were experts surveyed on a subject deemed to be within their professional competence, and the survey was voluntary and anonymous. Potential participants were informed upfront that, by completing the survey, they agreed that the data collected would be used for the study and summarized in aggregate in a publication.

Statistical analysis

Likert scores were analysed by computing the mean with its standard deviation for each item using Excel (Microsoft Excel for Microsoft 365 MSO, Version 2208). No data were imputed. All comments obtained from the free-text fields were documented and used for internal quality improvement purposes.
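For illustration only (the actual analysis was performed in Excel, as stated above), the per-item computation can be sketched in Python using hypothetical scores; ‘not applicable’ responses, coded here as None, are excluded before computing the mean and standard deviation:

```python
import statistics

# Hypothetical Likert scores (1-7) for a single PANELVIEW item;
# None represents a 'not applicable' response.
responses = [7, 6, 7, 5, None, 6, 7, 6]

# Exclude 'not applicable' responses before analysis.
valid = [r for r in responses if r is not None]

mean = statistics.mean(valid)   # arithmetic mean
sd = statistics.stdev(valid)    # sample standard deviation

print(f"n = {len(valid)}, mean = {mean:.1f}, SD = {sd:.2f}")
# → n = 7, mean = 6.3, SD = 0.76
```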

Results

Fifty-seven persons participated in at least one consensus conference and received an invitation to the survey. Among them, 21 (36.8%) received invitations for one, 9 (15.8%) for two, and 27 (47.4%) for all three survey rounds, so there was substantial overlap in potential survey respondents. Response rates were 36% (n/N = 13/36) for the first consensus conference, 40% (12/30) for the second, and 37% (20/54) for the set of three final consensus conferences. The number of invitees (N) depended on the number of topics discussed in each consensus conference, because topic experts who were not delegates of participating societies attended only some consensus conferences. N is especially high for the third survey round, because it covered three separate consensus conferences with multiple topics.
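The reported response rates follow directly from the respondent counts (n) and invitee counts (N) per round; as a check, with the figures taken from the text above:

```python
# Respondents (n) and invited conference participants (N) per
# survey round, as reported in the text.
rounds = {
    "conference 1": (13, 36),
    "conference 2": (12, 30),
    "conferences 3-5": (20, 54),
}

for label, (n, N) in rounds.items():
    print(f"{label}: {n}/{N} = {100 * n / N:.0f}%")
# → conference 1: 13/36 = 36%
#   conference 2: 12/30 = 40%
#   conferences 3-5: 20/54 = 37%
```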

Characteristics of participants who responded to the survey are detailed in Table 1. Respondents were mainly clinical experts; other roles included the panel chair/co-chair, methodologists, and steering group members. This reflects the composition of the guideline group. No patient representatives agreed to participate in the development of this guideline, and therefore none were available to participate in the survey. Around three quarters of participants had previous experience in guideline development, most often as panel members. The majority had a formal education in medicine; others were trained in nursing, epidemiology, or the natural sciences. Formal training in research methodology ranged from some training to a PhD degree. Participants were mainly clinical/health professionals; many were also active in research and/or teaching.

The mean scores for items ranged from 5.1 to 6.9 on a scale from 1 (fully disagree) to 7 (fully agree), as shown in Table 2. Items with mean scores below 6.0 concerned (a) administration (four items in this domain), (b) the consideration of patients’ views, perspectives, values, and preferences, and (c) the discussion of research gaps and needs for future research.

Table 2 Responses collected by the guideline group during consensus conferences 1, 2, and 3–5

Discussion

The responses collected by the guideline group after each consensus conference were very positive overall, with a mean score for most items between 6 and 7. Between the three surveys, no major differences or trends over time were observable. The good results may be associated with the experience of the participants, around three quarters of whom had prior experience with guideline development, several within the same guideline group. In addition, this was a guideline update and largely followed previously established processes.

Items with slightly lower mean scores between 5 and 6 were related to administration, consideration of patients’ perspectives, and the discussion of research gaps. The lower mean scores in the field of administration may in part be associated with the high workload for guideline group members in this guideline: There were five all-day online consensus conferences, during which more than 300 recommendations were discussed, along with all required preparation and follow-up tasks. Other possible reasons may include technical issues with the content management system and with communication processes. Also due to the high workload, the discussion was highly focused on the evidence and recommendations, and there was little time left to address research gaps. The direct consideration of patient views was not possible because the guideline group did not include a patient representative. Patient involvement in this guideline update was planned, but we did not succeed in recruiting patients. A reason for this is the absence of patient organisations or self-help groups in the area of major trauma in Germany.

The response rate to the PANELVIEW survey was in the range of 35 to 40%. There are currently no published figures to compare this with, although the developers of PANELVIEW have observed response rates of 50–90% (personal communication). Our response rate appears adequate in the context of the substantial workload for guideline group members and the voluntary nature of the survey.

PANELVIEW has previously been used in a pilot test of eight guideline panels consisting of 94 panellists reported by Wiercioch et al. [6]. In these panels, described as ‘high performing’, mean scores for items ranged from 5.5 to 6.8. These values are similar to those we observed. Several items with mean scores below 6.0 overlap between our results and those published by Wiercioch et al., i.e. those related to administration and to the consideration of patients’ perspectives. Mean scores were higher in our project for management of potential bias in panel members’ interpretation of evidence (item 9), involvement and consultation with key stakeholders (item 17), and writing of the guideline (item 31). Khabsa et al. recently reported an average of 6.47 (SD = 0.18) across all items for their guideline adolopment process for the treatment of rheumatoid arthritis (RA) in Saudi Arabia [10]. This is also comparable to our result, although we did not calculate an average across all items. Similar to our study, the RA guideline was also supported by a very experienced guideline development group.

The study has some limitations. First, although the instrument and its German-language translation were developed using rigorous methods, neither has yet been validated. Second, the threshold we used to identify areas for quality improvement of the guideline (mean scores below 6.0) was chosen arbitrarily; with a lower threshold, e.g. 5.0, all items would have been above it. Finally, the surveys were started between 3 days and 7 weeks after the consensus conferences; however, this variable time lag does not appear to have substantially affected the response rates.

Conclusion

The guideline development group of the German polytrauma guideline was satisfied overall with the development process, methods, and outcomes, as evaluated by the PANELVIEW instrument. Areas for future quality improvement include administration, the involvement of patients, and the discussion of needs for future research.