Background

Clinical practice guidelines (CPG) are developed to align standards of health care around the world, with the aim of reducing the incidence of misconduct and enabling more effective use of health resources [1].

The process of creating a CPG with reliable and high-quality recommendations requires methodological rigor in the fulfillment of a series of steps in a systematic and organized manner [2].

The initial step is to define which clinical question and population will be addressed by the CPG and to select which outcome variables should be assessed to answer that question [3]. A systematic review is conducted to retrieve the best available evidence. The data obtained are synthesized and analyzed, generating a body of evidence and an estimate of effect. After grading the quality of the evidence and the strength of the recommendations, CPG authors seek consensus for the final drafting of the guideline. Finally, proposed guidelines should be disseminated and implemented, and their content should be regularly updated as new evidence emerges [4].

In the field of rheumatology, CPG have special relevance. The growing knowledge about the pathophysiology and natural history of immune-mediated rheumatic diseases, the modernization of complementary research tools and the profusion of new therapeutic options generate a wide range of evidence and, with it, many clinical doubts [5]. CPG are intended to help rheumatologists stay up to date with the best evidence and use it as a basis for patient care decisions [5].

The participation of the clinical rheumatologist with expertise in managing the population of interest is essential. Thus, the team of authors of the CPG should count not only on professionals with extensive knowledge about its methodology, but also on rheumatologists with extensive clinical know-how [5]. At the same time, it is important for rheumatologists to have some degree of technical and methodological knowledge to be able to critically evaluate the recommendations in the CPG [5].

The literature has many studies and tools related to the methodology and quality analysis of CPG. This information, however, is scattered in different articles and publications and may present a complex and poorly understood technical language. Therefore, rheumatologists may be very reluctant to participate in the development of CPG or to trust and adhere to the proposed recommendations [2, 5].

Objectives

The Epidemiology Commission of the Brazilian Society of Rheumatology provides this methodological guide for the elaboration, development, quality assessment and updating of CPG in rheumatology.

Based on an extensive narrative review of the literature, the objective is to clarify this process in a practical and accessible way for the clinical rheumatologist to support the interpretation and quality evaluation of CPG, as well as to encourage rheumatologists to participate in the elaboration of CPG.

Drafting the scope of the guideline

The development of CPG starts with a clear definition of the scope of the guideline, the roadmap that will guide and delimit all the next steps to be taken [2].

This stage is crucial: here the authors define the size of the project and the working team; the target population and the setting or scenario where the guideline will be implemented; and which study designs and outcome measures should be included [2].

Guideline development is a complex, expensive and time-consuming process. For better use of resources, it is suggested that the scope of the guideline address topics in the following situations: (a) there is a need to update existing guidelines to incorporate new diagnostic technologies or new therapeutic options; (b) the development of a new guideline will result in a potential improvement in the quality of patient care; (c) there is good quality evidence to support the practices and treatments to be recommended by the guideline; (d) the management of a given health condition is surrounded by uncertainties, resulting in significant practical inconsistencies among health professionals; (e) there are strategic areas for the health system or a health institution that require better and more consistent care protocols [6].

Modifying and adapting a good quality pre-existing guideline rather than building a new one from scratch can be an excellent approach to reduce duplication of effort, enhance efficiency, and improve CPG utilization [7, 8].

The international collaboration ADAPTE has developed the Resource Toolkit for Guideline Adaptation, a manual that supports the adaptation of CPG to environments other than the original, with a systematic approach and careful contextualization [7].

Working groups assignment: the management committee, the elaboration group and the panelist group

The members of the guideline elaboration team must form a management committee, an elaboration group, and a panelist group. These groups are composed of methodologists, health economists, a systematic review team, clinical experts, and an administrative support team. They may work collaboratively among themselves, involving consumers and stakeholders.

The management committee must include members with expertise in systematic review, epidemiology, and public health, and who are familiar with the research subject, as well as members of the institution that commissioned the guideline. It is a committee that oversees all steps throughout the process. It should be the first group to be formed, and it is recommended to include 4–10 people. The management committee's functions are to coordinate the guideline scope elaboration process; to identify and invite the members to compose the elaboration group; to monitor the development of the guideline; and to review and approve the final version of the guideline [9].

The elaboration group must be composed of specialists with experience in the critical evaluation of scientific articles, in evidence-based health and in the performance of systematic reviews. This profile includes epidemiologists or methodologists with knowledge on evidence synthesis for health-related decision making [9]. It should include 8–12 people who will perform the following main tasks: to search and critically evaluate the evidence to support the recommendations; to formulate recommendations; to select and train panelists in terms of working methods, and to assess and incorporate suggestions from external reviews. The elaboration group must also have a leader who will participate in the management committee, facilitating interaction among the working group members so that the process is carried out in a collaborative environment [9].

The panelist group should be multidisciplinary and include health managers, health professionals, specialists in the guideline theme, economists, and patients, providing the point of view of those to whom the recommendations are directed. It is recommended that members have diversified knowledge to assess the realities of the different geographic regions and their populations. In addition, it is important to consider both the vision of the primary care professional and the one who works in specialized care, as well as the access to technology both by the public system and the complementary system. Ideally, this group should have 6 to 10 members and should assist other members involved in the guideline development and formulation of the recommendations [9].

It is important to point out that everyone involved in the guideline elaboration must complete a conflicts of interest (CI) statement before the work begins. Such statements must be available to the entire drafting team. The declaration of financial and non-financial competing interests must be mandatory. Any change in the CI during the guideline development must be communicated to the management committee and shared with the entire team. A member with a substantial CI should not be the leader of the elaboration group or participate in the management committee. A member of the elaboration group or panelist group with potential CI can still participate in the process, if those CI are transparently disclosed. CI must be declared in the appendix of the guideline [4].

Research question and the PICO model

The research questions (or key questions) must fit the scope defined by the guideline management committee. The PICO model, an acronym for Population, Intervention, Comparator, and Outcomes, captures the key elements of the research question, enabling the design of a broad and clear research strategy, consistent with CPG objectives [9, 10].

P: Population

It corresponds to the population, problem or health condition. “Who is the relevant population?” “What are the characteristics of the population?” “Are there subgroups that need to be considered?” The most challenging decision when framing the research question is to define the population to which the intervention will be applied [10, 11]. For example, in addressing the effect of a new immunobiological medication for the treatment of rheumatoid arthritis, one might include only patients who did not respond to other available immunobiological medications or who did not respond to nonbiologic disease-modifying antirheumatic drugs (DMARDs). The magnitude of effect on key outcomes may differ depending on the population chosen. If such populations are analyzed together, the guideline will generate misleading estimates for at least some subpopulations of patients and interventions [10].

I: Intervention

“Which intervention will be evaluated?” The description of the intervention must mention its availability in the Brazilian Unified Public Healthcare System (SUS), and/or its coverage by supplementary health, regulated by the National Health Agency [12]. Registration and evidence of intervention with the Brazilian regulatory agency, Agência Nacional de Vigilância Sanitária (ANVISA), must be reported. Information from regulatory agencies in other countries is also recommended [12].

C: Comparator

“What is the main intervention considered standard to compare with the new intervention under consideration?” The comparator generally refers to the standard care for the condition being studied or to placebo. Whenever possible, the comparator should be the one already available in the Brazilian Unified Public Healthcare System (SUS) or the supplementary health care system for the same clinical situation [12].

The comparator should be clear and evident, facilitating interpretation of the recommendations proposed by the CPG [10]. When there are multiple comparators, it should be clear whether all agents are equally recommended or whether there is superiority of one over the other [10].

O: Outcomes

“What is really important to the patient?” Final (hard) outcomes such as mortality, survival, morbidity, quality of life, treatment complications or adverse effects should be prioritized over surrogate outcomes [10]. Harms associated with diagnostic tests or treatment strategies, patient-reported outcomes and outcomes that evaluate public health impact are of growing relevance in the literature [10]. It is important to note that the choice of outcomes of interest in a study is subject to cultural and regional influences, and if there is little evidence on an important outcome, this fact should be acknowledged rather than used as a reason to exclude the outcome [10].

Guideline panels using GRADE will consider the importance of outcomes in three steps. First, a preliminary classification of outcomes must be done before the review of the literature [10]. Using the GRADE method, the elaboration group and the panelist group must analyze all outcomes chosen for each PICO question and score them between 1 and 9. On this scale, outcomes with scores of 7–9 are classified as “critical” for decision making; those with scores of 4–6 are classified as “important but not critical”; and those with scores of 1–3 are classified as of low importance for decision making. The hierarchy of outcomes will be important in judging the quality of the body of evidence by GRADE.
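The 1 to 9 scale described above can be sketched as a small lookup. This is only an illustration: the function name, the outcome names and the scores are hypothetical, and the sketch does not address how the votes of individual group members are combined into one score.

```python
def classify_outcome(score: int) -> str:
    """Map a 1-9 importance score to the category described in the text."""
    if not 1 <= score <= 9:
        raise ValueError("score must be between 1 and 9")
    if score >= 7:
        return "critical"
    if score >= 4:
        return "important but not critical"
    return "low importance"


# Hypothetical outcomes and scores for a single PICO question.
example_scores = {
    "mortality": 9,
    "quality of life": 7,
    "mild adverse effects": 5,
    "change in a laboratory marker": 2,
}

for outcome, score in example_scores.items():
    print(f"{outcome}: {score} -> {classify_outcome(score)}")
```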

It is also important to define the type of study design that will be considered to better answer each question. Therefore, the variant called PICOS is also used, where the letter “S” stands for “study design”.
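As an illustration, a PICOS question can be recorded as a simple structured object so that every element is stated explicitly before the search begins. The class name, field names and example values below are assumptions made for this sketch; the example loosely follows the rheumatoid arthritis scenario mentioned under "P: Population".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PicosQuestion:
    """One research question broken into its PICOS elements."""
    population: str
    intervention: str
    comparator: str
    outcomes: List[str]
    study_designs: List[str] = field(default_factory=list)

    def as_text(self) -> str:
        """Render the question as a single readable sentence."""
        return (
            f"In {self.population}, what is the effect of {self.intervention} "
            f"compared with {self.comparator} on {', '.join(self.outcomes)}? "
            f"Eligible designs: {', '.join(self.study_designs) or 'not specified'}."
        )


# Hypothetical question; none of these values come from a real guideline.
question = PicosQuestion(
    population="adults with rheumatoid arthritis who did not respond to nonbiologic DMARDs",
    intervention="a new immunobiological medication",
    comparator="standard care",
    outcomes=["quality of life", "adverse effects"],
    study_designs=["randomized controlled trial"],
)
print(question.as_text())
```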

Search, identification and selection of relevant studies

After assembling the PICO question, the next step in developing a guideline is to search the literature for relevant studies that meet the eligibility criteria [13].

CPG authors should specify in advance which study designs best answer the research question; which specific characteristics of the population, the intervention and the comparison best serve the purpose of the guideline; and which other study characteristics, such as language and publication status, must be observed for inclusion [13].

These eligibility criteria must be broad enough to encompass the great diversity of studies available in the literature but restricted enough so that the studies can be grouped and compared in the data synthesis and analysis stage [13].

The search phase is the foundation on which the quality of a guideline rests. A well-conducted and systematized search strategy ensures that most relevant papers are retrieved, minimizing reporting bias and achieving more reliable estimates of effects and uncertainties [13]. It is recommended that authors work closely with a librarian or health information specialist from the beginning of the literature search [13].

Databases and other sources

A broad, clear, and reproducible search strategy should be devised, covering the most established health-related databases and also unpublished data sources, ongoing studies and grey literature [9]. The more databases that are searched, the more laborious the review will be, so optimizing this choice is critical [13, 14].

MEDLINE and EMBASE are the most widely used databases worldwide, and there is little overlap between them in the references retrieved on musculoskeletal disorders [13, 14]. Other important search sources are The Cochrane Library, LILACS (a highly recommended Latin-American database), and subject-specific databases such as CINAHL, PsycINFO or PEDro [13, 15].

Trials registers (such as ClinicalTrials.gov), grey literature (reports, dissertations, theses, and conference abstracts) and manual searching of reference lists of included studies must also be part of the search strategy [13].

Descriptors, Boolean operators and search filters

A search strategy must contain the appropriate terms and descriptors to find the elements that constitute the PICO question. Databases may be searched using a combination of two retrieval approaches [13].

  • Text words: search for direct free-text terms occurring in the title or abstract.

  • Descriptors: subject terms used by each database to “officially” label a particular concept.

The use of descriptors aims to increase the number of retrieved studies without substantially increasing the number of non-relevant references [14]. Each database has its own controlled vocabulary: MeSH (MEDLINE and Cochrane Library), Emtree (EMBASE) and DeCS (LILACS).

Once all free-text terms and controlled vocabulary terms are chosen, they must be combined using the logical Boolean operators AND, OR and NOT. The AND operator combines terms so that the database retrieves only studies that contain all of them. The OR operator combines keywords so that the database retrieves studies that contain at least one of them. When the OR operator is used, the group of keywords it joins must be enclosed in parentheses. The NOT operator is used to exclude keywords from search results and should be avoided whenever possible because of the risk of inadvertently removing relevant records from the search set [13].
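The combination of operators can be sketched as follows: synonyms for one concept are joined with OR inside parentheses, and the concept groups are then joined with AND. The function names and search terms are hypothetical, and real databases differ in their syntax for phrases, truncation and field tags, so the output should be read as a generic illustration rather than a string ready to paste into MEDLINE or EMBASE.

```python
def or_group(terms):
    """Join synonyms for one concept with OR and wrap the group in parentheses."""
    # Multi-word terms are quoted so they are treated as phrases.
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*concepts):
    """Join concept groups with AND so a record must match every concept."""
    return " AND ".join(or_group(concept) for concept in concepts)


# Hypothetical free-text terms for the P and I elements of a PICO question.
population_terms = ["rheumatoid arthritis", "arthritis, rheumatoid"]
intervention_terms = ["biological therapy", "biologic*"]

query = build_query(population_terms, intervention_terms)
print(query)
```

Keeping each concept in its own list makes it easy to review the synonyms for one PICO element at a time and to report the final strategy in full.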

It is also possible to use search filters to further delimit the studies retrieved by language, age group, study design and other characteristics. Search filters in checkbox format are available on database websites, but it is highly recommended to use filters validated and tested by groups of experts such as Cochrane and the InterTASC Information Specialists Group [16].

Tools such as the Peer Review of Electronic Search Strategies (PRESS) checklist can help with the steps of designing the search strategy, which must be reported in sufficient detail to be reproducible [13, 17].

Study selection

The study selection process must be carried out by two independent reviewers, and disagreements may require arbitration by a third person. Studies retrieved from all sources and databases are merged, duplicates are removed, and researchers review titles and abstracts and then full texts, selecting studies that meet the eligibility criteria for inclusion. This process must be documented step by step using the PRISMA flow diagram [13, 18].

Several tools and software such as Rayyan QCRI, DistillerSR and Covidence have been developed to facilitate the process of study selection [13, 19,20,21].
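The selection steps described in this section (merging records, removing duplicates, independent screening by two reviewers and flagging disagreements for a third person) can be sketched in a few lines. This is a simplified illustration and not how the tools cited above work internally: the records are invented, and matching duplicates by normalized title is an assumption made for the sketch.

```python
def normalize(title):
    """Reduce a title to lowercase letters and digits for duplicate matching."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def deduplicate(records):
    """Keep the first record seen for each normalized title."""
    seen, unique = set(), []
    for record in records:
        key = normalize(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def screen(records, reviewer_a, reviewer_b):
    """Split records into included, excluded and conflicting decisions."""
    included, excluded, conflicts = [], [], []
    for record in records:
        a, b = reviewer_a[record["id"]], reviewer_b[record["id"]]
        if a != b:
            conflicts.append(record["id"])  # to be settled by a third reviewer
        elif a:
            included.append(record["id"])
        else:
            excluded.append(record["id"])
    return included, excluded, conflicts


# Hypothetical records merged from three databases.
records = [
    {"id": 1, "title": "Drug X in rheumatoid arthritis", "source": "MEDLINE"},
    {"id": 2, "title": "Drug X in Rheumatoid Arthritis.", "source": "EMBASE"},
    {"id": 3, "title": "Exercise therapy for gout", "source": "LILACS"},
    {"id": 4, "title": "Drug X safety cohort", "source": "EMBASE"},
]
unique = deduplicate(records)

# Independent include (True) / exclude (False) decisions from two reviewers.
reviewer_a = {1: True, 3: False, 4: True}
reviewer_b = {1: True, 3: False, 4: False}
included, excluded, conflicts = screen(unique, reviewer_a, reviewer_b)

# Counts of the kind reported in a PRISMA flow diagram.
flow = {
    "identified": len(records),
    "duplicates_removed": len(records) - len(unique),
    "screened": len(unique),
}
print(flow, included, excluded, conflicts)
```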

Assessment of the quality of the body of evidence

Assessment of the quality of the included studies should be done by two independent reviewers, and disagreements may be resolved by consensus or require arbitration by a third person. There are tools specially developed for each study design, for example AMSTAR for systematic reviews; QUADAS-2 for diagnostic accuracy studies; the Newcastle–Ottawa tool for case-control or cohort studies; ROBINS-I for non-randomized studies of interventions; and RoB 2 for randomized clinical trials [13, 22,23,24,25,26].

The GRADE assessment

The GRADE tool allows authors to assess the quality of a body of evidence and the strength of the recommendations derived from it [27, 28]. It provides guidance in the process of determining the outcomes of interest, summarizing the evidence, and formulating a recommendation in an outcome-focused manner. The GRADE classification is made for each outcome, and quality may differ from one outcome to another [27, 28]. Furthermore, the GRADE classification makes it possible to grade the recommendation of a given course of action in clinical practice, as well as in decision-making on public and private health policies.

The GRADE tool rates the quality of evidence in one of four levels: high, moderate, low, and very low [27, 29, 30]. The quality of evidence may be compromised by risk of bias (limitations in study design or execution), imprecision (wide confidence intervals), inconsistency (or heterogeneity), the indirect nature of the evidence (correspondence to PICO and applicability), and publication bias [27]. Likewise, there are factors that can increase confidence in effect estimates: a large magnitude of effect; plausible residual confounders and biases that would reduce the demonstrated effect, or increase the effect if no effect was observed; and evidence of a dose–response gradient [27].
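The rating logic can be sketched as moving up or down a four-level ladder. Two points in this sketch are assumptions that do not come from the text above: the starting level (high for randomized trials, low for observational studies, as in general GRADE guidance) and the number of levels attached to each concern, which in practice is a judgment made by the panel rather than a mechanical sum.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(randomized, downgrades=None, upgrades=None):
    """Return the quality level for one outcome.

    downgrades / upgrades map a reason (e.g. "imprecision") to the number
    of levels the panel decided to move for that reason.
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    # Assumed starting point: randomized evidence starts high, observational low.
    index = 3 if randomized else 1
    index = index - sum(downgrades.values()) + sum(upgrades.values())
    # The rating cannot leave the four-level scale.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# Hypothetical judgments for two outcomes of the same PICO question.
pain = grade_quality(True, downgrades={"imprecision": 1, "inconsistency": 1})
infection = grade_quality(False, upgrades={"dose-response gradient": 1})
print("pain:", pain)
print("serious infection:", infection)
```

Because the rating is made per outcome, the same set of studies can yield different levels for different outcomes, as in the two hypothetical calls above.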

Direction and strength of the recommendations

The GRADE approach rates a recommendation as strong or weak. To determine the strength of a recommendation, it is necessary to critically weigh the desirable and undesirable effects of an alternative strategy, the quality of the primary studies, the values and preferences in the setting where the new strategy will be applied, and the rational use of resources [31,32,33].

The GRADE evidence profile tables should be used to inform the final decision on the quality of the body of evidence for each outcome and to make the quality rating explicit and reproducible [33].

Achieving consensus: the Delphi model

Some health issues are under-explored due to ethical or logistical difficulties, generating low-quality, insufficient, or contradictory evidence. Writing a guideline on these topics represents a major challenge, since it is not possible to gather such evidence in a body and analyze it through objective techniques such as GRADE [34, 35].

In these scenarios, the development of recommendations can be based on the clinical experience of highly qualified professionals using the Delphi method, a structured technique that anonymously and systematically collects the opinions of panelists, generating a reliable consensus with statistical value [35].

The Delphi process is based on the application of iterative questionnaires to a panel of professionals over several rounds, until the divergence between opinions has been reduced to a satisfactory level [36]. The possibility of application in virtual format allows the participation of specialists from different regions of the globe and reduces costs. The anonymity of responses mitigates the influence of the most prominent panelists, allowing a more homogeneous and proportional participation of all professionals [37,38,39]. Other important advantages of the method are the feedback that participants receive on their contributions, the possibility of reviewing the experts' answers and the formation of heterogeneous groups with different clinical experiences [37,38,39].
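The stopping rule of a Delphi process can be illustrated with a short sketch. The definition of agreement used here (the share of panelists rating a statement 7 to 9 on a 9-point scale) and the 70% threshold are assumptions chosen for the example; each panel must define its own criteria in advance. The ratings are invented, and in a real process each round is collected only after panelists have received feedback from the previous one.

```python
def agreement(ratings, low=7, high=9):
    """Share of panelists whose rating falls within [low, high]."""
    in_range = sum(1 for rating in ratings if low <= rating <= high)
    return in_range / len(ratings)


def first_consensus_round(rounds, threshold=0.7):
    """Return (round number, agreement) for the first round that reaches the
    threshold, or for the last round if consensus is never reached."""
    if not rounds:
        raise ValueError("at least one round of ratings is required")
    for number, ratings in enumerate(rounds, start=1):
        share = agreement(ratings)
        if share >= threshold:
            break
    return number, share


# Hypothetical anonymous ratings from ten panelists for one statement.
rounds = [
    [9, 4, 7, 5, 8, 3, 6, 7, 9, 5],  # round 1
    [8, 6, 7, 6, 8, 5, 7, 7, 9, 6],  # round 2, after feedback
    [8, 7, 7, 8, 8, 6, 7, 8, 9, 7],  # round 3, after feedback
]
round_number, share = first_consensus_round(rounds)
print(f"Consensus check stopped at round {round_number} with {share:.0%} agreement")
```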

Limitations of the Delphi model include the possibility of bias due to personal interests or preconceived opinions of either the investigators leading the process or the participants [34, 40].

The sequence for developing the Delphi model is succinctly described in Table 1.

Table 1 Checklist for applying the Delphi model [36]

Writing and reporting a guideline

After gathering and analyzing the evidence for each PICO question, the full text of the guideline must be prepared, describing in detail all the steps taken to reach the conclusions and the pertinent recommendations. Summarized versions of the guideline should be made for specific audiences [41].

Submitting a guideline draft for external review or public consultation may be recommended. The management committee evaluates the revised text, and final adjustments can be made by the elaboration group prior to publication of the final version of the guideline.

Dissemination and implementation of a guideline

To incorporate the recommendations of a guideline into clinical practice, it is essential to disseminate its content to a variety of healthcare professionals and patients. Of special importance is the Cochrane Effective Practice and Organisation of Care (EPOC) group, which provides active strategies and other arrangements for implementing guideline recommendations [42].

Updating a guideline

Since knowledge is dynamic and is constantly evolving, efforts to keep CPG updated are necessary to maintain the validity of recommendations. Methods of CPG development have progressed substantially in the past 20 years, but guidance for updating CPG remains heterogeneous and poorly described in methodological handbooks [43]. Therefore, there is no consensus about how frequently CPG should be updated, although most authors recommend assessing the need for an update every three to five years [43].

Table 2 summarizes a systematic approach for deciding whether CPG might need updating and a simplified updating process framework [43, 44].

Table 2 Basic steps for the updating process of clinical practice guideline [43, 44]

Quality assessment of guidelines

A high-quality guideline should ensure that potential biases are adequately addressed and that the resulting recommendations are feasible, with internal and external validity.

Currently, the main tools available for the reporting of guideline development are the Appraisal of Guidelines for Research and Evaluation II (AGREE II) instrument and the Reporting Items of Practice Guidelines in Healthcare (RIGHT) statement [45, 46].

The AGREE II instrument consists of 23 items comprising 6 quality domains and has 3 main objectives: to provide a methodological strategy for the elaboration of a clinical guideline, to guide how it should be adequately reported and to assess its quality [45, 47]. Details on how to calculate the domain scores, as well as translated versions, are available on the AGREE Enterprise website [48].
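As an illustration of the domain score calculation, the sketch below follows the scaled-score approach as we understand it from the AGREE II user's manual: each item is rated from 1 to 7 by each appraiser, and the domain total is rescaled between the minimum and maximum possible totals. The formula is not given in this guide, so the manual on the AGREE Enterprise website should be treated as the authoritative source; the ratings below are invented.

```python
def scaled_domain_score(ratings_by_appraiser, min_rating=1, max_rating=7):
    """Scaled domain score (%) from a list of per-appraiser item ratings."""
    n_appraisers = len(ratings_by_appraiser)
    n_items = len(ratings_by_appraiser[0])
    if any(len(ratings) != n_items for ratings in ratings_by_appraiser):
        raise ValueError("every appraiser must rate every item in the domain")

    obtained = sum(sum(ratings) for ratings in ratings_by_appraiser)
    minimum = min_rating * n_items * n_appraisers
    maximum = max_rating * n_items * n_appraisers
    return (obtained - minimum) / (maximum - minimum) * 100


# Hypothetical ratings: four appraisers scoring a three-item domain.
domain_ratings = [
    [5, 6, 6],
    [6, 6, 7],
    [2, 4, 3],
    [3, 3, 2],
]
score = scaled_domain_score(domain_ratings)
print(f"Scaled domain score: {score:.1f}%")
```

In this example the obtained total is 53, the minimum possible total is 12 and the maximum is 84, so the scaled score is (53 - 12) / (84 - 12), or about 57%.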

The RIGHT statement includes 22 items considered essential for good reporting of CPG [47, 49]. AGREE II and RIGHT may be used interchangeably to improve the comprehensiveness, completeness, and transparency of CPG reporting [46]. Their main features are summarized in Table 3.

Table 3 RIGHT Statement and AGREE Reporting Checklist domains and essential items [47, 49]

Another instrument of great relevance in ensuring the quality of CPG is the GIN-McMaster Guideline Development Checklist, which consists of 18 topics addressing all stages of guideline elaboration, from planning to implementation and evaluation [50]. The purpose of this instrument is not to replace AGREE II, but to ensure that, by following the steps described in the checklist, key items are covered and higher scores are achieved in the credibility assessment tools [51].

Conclusions

Guidelines provide support for decision-making in clinical practice. Guideline development is based on the selection and synthesis of the best available evidence, using a systematic and transparent approach in the judgment of the quality of evidence and strength of recommendations. The methodological guide proposed in this review could be used as an important tool for creating, elaborating, and updating CPG and consensus in rheumatology. Figure 1 presents a roadmap listing the steps described in this paper for developing a guideline.

Fig. 1 Brazilian Society of Rheumatology methodological guide for the development of evidence-based CPGs in rheumatology roadmap