Background

Inflammatory bowel disease (IBD) is a family of diseases characterized by chronic inflammation and immunological dysregulation in the gastrointestinal tract; it includes, but is not limited to, Crohn’s disease (CD) and ulcerative colitis (UC) [1]. CD can involve inflammation anywhere along the gastrointestinal tract, though it primarily affects the small intestine, whereas UC is limited to the large intestine and rectum [2, 3]. Signs and symptoms of both diseases include abdominal pain, diarrhea, fatigue, weight loss, and bloody stools [2]. IBD typically emerges in early adulthood and persists over the patient’s lifespan as continuous cycles of remission and relapse [1, 3]. Due to the chronic nature of this disease, quality of life, social functioning, and the ability to work are all severely impacted for patients with IBD [1, 4]. Additionally, IBD can impose a profound psychosocial burden on mental health, with patients describing social isolation, loss of bowel control, impairment of body image, and fear of dependency as factors contributing to emotional distress [5,6,7]. Patients with IBD have high rates of anxiety and depression [7]; further, one study demonstrated that patients with complex IBD may have a greater prevalence of depression and poorer perceived health than those with uncomplicated IBD [6]. Approximately 1.5 million Americans and 2.2 million Europeans have been diagnosed with IBD [3, 4], and with increasing prevalence worldwide, IBD is emerging as a new burden on healthcare systems [8]. The etiology of IBD is unknown, but complex interactions between genetic susceptibility, age, environment (e.g., stress, diet, or hygiene), and dysbiosis of the gut microbiota all contribute towards the development of IBD [1].
Conventional treatments generally take an anti-inflammatory approach to achieving and maintaining remission in IBD, and include immunomodulators, steroids, biologic agents (e.g., monoclonal antibodies), and surgical interventions [9,10,11]. However, long-term remission and symptom management remain a challenge with many current conventional treatments (e.g., steroids), which often have undesirable adverse effects (e.g., steroid-induced hyperglycemia, increased infection risk) [11, 12]. Correspondingly, patient concerns about treatment adverse effects are one factor associated with treatment noncompliance in many inflammatory conditions (including IBD), which can result in negative patient health outcomes [13]. Further, adverse effects from conventional IBD treatment are a predictor of complementary and alternative medicine (CAM) use among many patients with IBD [14,15,16].

CAMs are a diverse group of non-mainstream therapies and practices that fall outside the purview of conventional medicine [17, 18]. Complementary medicine refers to non-conventional treatments used in conjunction with conventional treatments, while alternative medicine refers to non-conventional treatments used in place of conventional treatments [17, 18]. While 21 to 60% of patients with IBD have reported CAM use [10, 19,20,21,22,23], many of these patients do not disclose their CAM usage to their healthcare providers [19]. Accordingly, many healthcare professionals have limited knowledge of CAM treatments, and a better understanding of the evidence for current IBD CAM treatments may be important for improving patient outcomes [10].

Nutritional therapeutics (e.g., herbs and dietary supplements) are the most common CAM therapy used by patients with IBD [10, 19,20,21,22,23]. Among the nutritional therapeutics, probiotics are the CAM therapy most commonly recommended for IBD by gastroenterologists because of their anti-inflammatory and immunomodulatory properties, which may reduce inflammation of the gastrointestinal tract [10, 19, 23]. One Italian double-blind randomized controlled trial demonstrated the efficacy of VSL#3, a probiotic mixture of eight bacterial strains, for IBD [24]. The combination of VSL#3 with conventional medicine (e.g., aminosalicylic acid, immunosuppressants) was more effective in treating IBD than conventional medicine alone [9, 10, 24]. Curcumin, a phytochemical commonly found in turmeric, has also been proposed as a CAM therapy for IBD based on its reported anti-inflammatory and anti-oxidative effects on human white blood cells [10, 19, 21,22,23]. There is, however, limited research on curcumin’s efficacy and dosing [25]. Mind-body practices such as mindfulness, hypnosis, meditation, and yoga are CAM interventions that aim to reduce stress, a potential contributor to IBD development [10, 19,20,21,22]. While there is some promising evidence that CAMs may be effective in treating IBD, clinicians generally do not have sufficient training and knowledge to propose CAM regimens or incorporate them into patient treatment plans [23].

Clinical practice guidelines (CPGs) are systematically developed statements often used by healthcare professionals to make recommendations for the treatment and/or management of various conditions, including IBD [26]. Evidence-based CPGs describe guidelines that undertake a systematic literature search, where recommendations are linked to evidence identified through the literature review [27]. Due to insufficient clinician training and expertise in CAMs, CPGs for CAM use in IBD would serve as a beneficial instrument to clinicians working with patients with IBD [28]. To our knowledge, no studies have analyzed the quality of recommendations on CAMs that are found within CPGs for IBD. Therefore, the purpose of this study was to conduct a systematic review to determine the quantity and assess the quality of CPGs providing CAM recommendations made for the treatment and/or management of IBD using the Appraisal of Guidelines for Research and Evaluation II (AGREE II) instrument.

Methods

Approach

A systematic review to identify CPGs providing recommendations for the treatment and/or management of IBD was conducted using Cochrane’s standard methods [29] and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) criteria [30]. A protocol for this study was registered with PROSPERO under registration number CRD42020182234. Eligible CPGs with CAM recommendations were assessed twice with the validated AGREE II instrument [31,32,33], evaluating both the overall CPG and the CPG’s CAM sections. The AGREE II instrument contains 23 items grouped into six domains: scope and purpose, stakeholder involvement, rigor of development, clarity of presentation, applicability, and editorial independence.

Eligibility criteria

Eligible CPGs were those that mention the treatment and/or management of any type of IBD, focusing on populations of adults 18 years of age and older. However, CPGs focusing primarily on special populations (e.g., pediatric, geriatric, pregnant, COVID-19 patients) were excluded. Eligible CPGs also had to meet the following criteria: developed by non-profit organizations (e.g., government agencies, or professional associations or societies); published in 2011 or later; published in the English language; and publicly available. CPGs were deemed ineligible if they were published as protocols, abstracts, conference proceedings, letters or editorials; based on primary studies that assessed IBD treatment and/or management; or focused on IBD curriculum, education, training, research, professional certification or performance. If a guideline summary was found, efforts were made to retrieve the full-length guideline; however, the summaries themselves were excluded. Furthermore, if a CPG had been updated multiple times, only the most recent full version of the CPG was assessed. One important note is that the AGREE II instrument was only used to assess eligible CPGs with CAM recommendations in order to establish the difference in scores for CAM-specific sections relative to the entire CPG.

Searching and screening

MEDLINE, EMBASE, and CINAHL were searched on May 20, 2022, from 2011 to May 19, 2022, inclusive. The search strategy (Supplementary File 1) included keywords that reflect terms typically used in the literature to describe IBD. The Guidelines International Network [34], an online repository of guidelines, was searched for eligible CPGs using the following keyword searches: “inflammatory bowel diseases”, “IBD”, “Crohn’s disease”, and “ulcerative colitis.” A search was also performed on the NCCIH website [35], which contains a series of CPGs with CAM recommendations for various conditions. All results were exported into Microsoft Excel for screening. A pilot test for title and abstract screening was performed independently by MCW and HL, followed by an audit by JYN. Following the pilot, MCW and HL screened all titles and abstracts (independently and in duplicate), followed by full text screening by MCW and HL (independently and in duplicate) to evaluate CPG eligibility. Following each step of independent screening, MCW and HL met to resolve discrepancies, and JYN reviewed the screened titles and abstracts and full-text articles, as well as assisted in resolving discrepancies that could not be resolved by MCW and HL.

Data extraction and analysis

In a data extraction spreadsheet, the following general characteristics were retrieved and summarized from each of the CPGs: publication date; country of origin; category of organization responsible for publishing the CPG (academic institutions, government agencies, disease-specific foundations, or professional associations or societies); and the presence of CAM mention or recommendations in the guideline (i.e., yes or no). Where CAMs were mentioned in a CPG, the following data were also extracted: category of mentioned CAMs, proposed CAM recommendations, CAM funding sources, and the presence of conflicts of interest (e.g., CAM providers contributing to the guideline panel). To further assess CPG applicability, each developer’s website was evaluated for any knowledge-based resources that supported guideline implementation. Data extraction of all CPGs occurred independently and in duplicate by MCW and HL. Following independent data extraction, MCW and HL met to resolve differences; JYN reviewed all extracted data and assisted in resolving any discrepancies unresolved by MCW and HL.

Guideline quality assessment

Data from eligible CPGs were extracted and analyzed with the AGREE II instrument in accordance with standardized methods [31,32,33]. JYN, MCW, and HL conducted a pilot test of the AGREE II instrument by independently assessing three CPGs. All three evaluators met to resolve any discrepancies. Then, all eligible CPGs containing CAM therapy recommendations were assessed twice by both MCW and HL: once for the overall CPG, and once for the CAM-specific portion of the CPG. All assessments were performed independently and in duplicate. CPGs were scored based on 23 items from six domains using a seven-point Likert scale from strongly disagree (1) to strongly agree (7) to determine if each item was met. Overall quality of the CPG was also rated from 1 to 7, which was used to recommend for or against the use of each CPG. Modified AGREE II questions were piloted by a team of researchers familiar with CPGs prior to the initiation of this study (see Supplementary File 2); these modified questions were then used to score the CAM-specific portions of each CPG. JYN helped to resolve scoring discrepancies between MCW and HL. The average assessment scores were determined by computing the average rating of a single evaluator for all 23 items of a single CPG, then averaging this value for both evaluators. The average of both evaluators’ “overall guideline assessment” ratings for each CPG was used to obtain average overall scores. Scaled domain percentages were generated by summing ratings of items within each domain as given by the two evaluators, followed by standardizing the score and converting it to a percentage. Each CPG’s average assessment scores, average overall scores, and scaled domain percentages were compiled for comparison.
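The scoring arithmetic described above can be sketched as follows. This is a minimal illustration, assuming the standard AGREE II scaling in which a domain score is (obtained score - minimum possible score) / (maximum possible score - minimum possible score), expressed as a percentage. The function names and the example ratings are hypothetical and are not taken from the study's data.

```python
from statistics import mean

# Seven-point Likert scale used by AGREE II.
LIKERT_MIN, LIKERT_MAX = 1, 7


def average_appraisal_score(ratings_by_appraiser):
    """Average each appraiser's item ratings, then average across appraisers."""
    return mean(mean(items) for items in ratings_by_appraiser)


def scaled_domain_percentage(domain_ratings_by_appraiser):
    """Sum all ratings in one domain and rescale to 0-100%.

    Assumes every appraiser rated the same number of items in the domain.
    """
    n_appraisers = len(domain_ratings_by_appraiser)
    n_items = len(domain_ratings_by_appraiser[0])
    obtained = sum(sum(items) for items in domain_ratings_by_appraiser)
    minimum = LIKERT_MIN * n_items * n_appraisers
    maximum = LIKERT_MAX * n_items * n_appraisers
    return 100.0 * (obtained - minimum) / (maximum - minimum)


# Hypothetical ratings from two appraisers for a four-item domain
# (e.g., applicability); these are illustrative values only.
example_domain = [[3, 3, 4, 3], [3, 4, 3, 3]]

example_average = average_appraisal_score(example_domain)
example_scaled = scaled_domain_percentage(example_domain)
print(f"average rating: {example_average:.2f}")
print(f"scaled domain percentage: {example_scaled:.1f}%")
```

With two appraisers and four items, the minimum possible total is 8 and the maximum is 56, so a total of 26 scales to 18/48, or 37.5%.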

Results

Search results (Fig. 1)

Searches retrieved 563 items, of which 490 were unique following deduplication. A further 341 items were eliminated based on abstract screening, yielding 149 items whose full texts were considered. Fifty-one items were considered eligible, with 98 items eliminated for the following reasons: 40 were not CPGs, 19 were on a non-IBD topic, 12 were non-English, 16 were CPGs primarily focused on a special population (e.g., pediatric, geriatric, pregnant, COVID-19 patients), 6 were guideline summaries, 4 were not most recent full updated CPGs, and 1 was irretrievable through public access or library systems.
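The screening counts reported above can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the preceding paragraph; the variable names and dictionary keys are illustrative labels.

```python
# Counts reported in the search results paragraph.
retrieved = 563
unique_after_deduplication = 490
excluded_at_abstract_screening = 341

# Reasons for exclusion at full-text screening, as reported.
full_text_exclusions = {
    "not a CPG": 40,
    "non-IBD topic": 19,
    "non-English": 12,
    "special population": 16,
    "guideline summary": 6,
    "not most recent full update": 4,
    "irretrievable": 1,
}

duplicates_removed = retrieved - unique_after_deduplication
full_texts_assessed = unique_after_deduplication - excluded_at_abstract_screening
excluded_at_full_text = sum(full_text_exclusions.values())
eligible_items = full_texts_assessed - excluded_at_full_text

print(f"duplicates removed: {duplicates_removed}")
print(f"full texts assessed: {full_texts_assessed}")
print(f"excluded at full text: {excluded_at_full_text}")
print(f"eligible items: {eligible_items}")
```

The derived totals (149 full texts, 98 full-text exclusions, 51 eligible items) match the figures reported in the text.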

Of the 51 eligible items [36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86], 26 made no mention of CAM, 4 only made mention of CAM, and 21 both mentioned CAM and provided CAM recommendations. One pair of items [37, 38] was considered as one CPG, rather than two, since the items formed the first and second parts of a guideline series. Additionally, a second pair of items [56, 57] was considered as one CPG, as the two articles were dually published but contained identical content. Of these pairs, one pair [37, 38] did not make mention of CAM or provide CAM recommendations, while the other [56, 57] provided CAM recommendations. Hence, in total, there were 49 eligible CPGs, whereby 26 CPGs made no mention of CAM, 4 CPGs only made mention of CAM, and 19 CPGs both mentioned CAM and provided CAM recommendations.

Guideline characteristics (Table 1)

Table 1 Characteristics of Eligible Guidelines

Eligible CPGs were published from 2011 to 2022 in the United States (n = 17), Austria (n = 5), Canada (n = 5), the United Kingdom (n = 4), Italy (n = 3), Brazil (n = 2), Japan (n = 2), China (n = 1), Denmark (n = 1), France (n = 1), Germany (n = 1), India (n = 1), Luxembourg (n = 1), Poland (n = 1), Spain (n = 1), and New Zealand (n = 1). Additionally, two CPGs had multiple guideline publishing organizations with headquarters based in different countries [39, 43]. Eligible CPGs were funded and/or developed by professional associations or societies (n = 46), disease-specific foundations (n = 2), and a government agency (n = 1). Four CPGs only mentioned CAM, discussing probiotics (n = 3), curcumin (n = 1), dietary therapies (n = 1), and fecal microbiota transplantation (n = 1).

The AGREE II tool was applied to CPGs making CAM recommendations. The 19 CPGs that provided CAM recommendations covered probiotics (n = 11), fecal microbiota transplantation (n = 5), calcium (n = 4), vitamin D (n = 4), iron (n = 3), omega-3 fatty acids (n = 3), cannabis (n = 2), curcumin (n = 2), nutrition therapy (n = 2), high-fibre diet (n = 2), mind-body medicine (n = 2), acupuncture (n = 1), adipose-derived stem cells (n = 1), chamomile (n = 1), gluten-free diet (n = 1), hypnotherapy (n = 1), ispaghula (n = 1), low-fat diet (n = 1), myrrh (n = 1), other herbal therapies (n = 1), and vegetarian diet (n = 1). Figure 2 summarizes all the CAM recommendations by their corresponding CPGs, for the benefit of clinicians and researchers. Out of the 19 CPGs, only one CPG [54] had CAM practitioners on the guideline development panel.

Guidelines mentioning CAM without recommendations

There were four CPGs that only made mention of CAM [58, 64, 65, 78]. One CPG noted the short-lasting effects of dietary therapies on CD inflammation reduction [65]. Another discussed a meta-analysis that found no benefits in using probiotics to induce remission, while also briefly mentioning curcumin without further elaboration on efficacy [58]. One CPG mentioned fecal microbiota transplantation as being investigated for treating pouchitis [64]. Three CPGs all discussed the limited evidence showing the probiotic VSL#3 to be effective in maintaining remission for pouchitis in patients with IBD [58, 64, 78].

Average appraisal scores, average overall assessments and recommendations regarding use of guidelines: overall guideline (Table 2)

Table 2 Average Appraisal Scores and Average Overall Assessments of Each Guideline

Average appraisal scores and average overall assessments, evaluated on a seven-point Likert scale, are given in Table 2 for the 19 CPGs assessed using the AGREE II instrument. On the Likert scale, 1 indicates strongly disagreeing, while 7 indicates strongly agreeing, that an item’s criteria were met. Average appraisal scores ranged from 3.2 to 5.5, where 12 CPGs had an average appraisal score of ≥ 4.0 and four CPGs had an average appraisal score of ≥ 5.0. Average overall assessments ranged from 3.0 to 5.5, where 14 CPGs had an average overall assessment of ≥ 4.0 and nine CPGs had an average overall assessment of ≥ 5.0. Five CPGs [48, 69, 70, 72, 73] had an overall assessment of ≤ 4.0.

Average appraisal scores, average overall assessments and recommendations regarding use of guidelines: CAM sections (Table 2)

Average appraisal scores and average overall assessments for CPGs’ CAM sections, evaluated on a seven-point Likert scale, are shown in Table 2 for the 19 CPGs assessed using the AGREE II instrument. On the Likert scale, 1 indicates strongly disagreeing, while 7 indicates strongly agreeing, that an item’s criteria were met. CAM average appraisal scores ranged from 2.5 to 4.9, where 15 CPGs had an average appraisal score of ≥ 3.0 and eight CPGs had an average appraisal score of ≥ 4.0. Four CPGs [64, 69, 72, 73] had a CAM average appraisal score of ≤ 3.0. CAM average overall assessments ranged from 2.5 to 5.0, with 11 CPGs having an average overall assessment of ≥ 4.0 and only one CPG [54] having an average overall assessment of ≥ 5.0.

Overall recommendations: overall guideline (Table 3)

Table 3 Overall Recommendations for Use of Appraised Guidelines

Of the 19 evaluated CPGs, 10 were recommended for use by both appraisers. Of these 10 CPGs, both appraisers agreed on a rating of “Yes with Modifications” for eight CPGs [43, 53, 56, 57, 66, 68, 71, 77, 85], while appraisers gave different ratings of “Yes” and “Yes with Modifications” for two CPGs [37, 38, 54]. Additionally, both appraisers agreed on a rating of “No” for four CPGs [48, 70, 72, 73], while the remaining five CPGs had conflicting ratings of “Yes with Modifications” and “No” [42, 47, 59, 69, 83].

Overall recommendations: CAM sections (Table 3)

From the 19 evaluated CPGs, only one CPG’s CAM section was recommended for use by both appraisers [54], where both appraisers agreed on a rating of “Yes with Modifications”. Both appraisers agreed on a rating of “No” for eight CPGs [53, 56, 57, 68,69,70, 72, 73], while the remaining 10 CPGs had conflicting ratings of “Yes with Modifications” and “No” [37, 38, 42, 43, 56, 57, 59, 66, 71, 77, 83, 85].

Scaled domain percentage quality assessment (Table 4)

Table 4 Scaled Domain Percentages for Appraisers of Each Guideline

Overall scaled domain percentage scores varied across CPGs, ranging from 72.2 to 100.0% for scope and purpose, 30.6–91.7% for stakeholder involvement, 26.0–86.5% for rigour of development, 69.4–100.0% for clarity of presentation, 0.0–37.5% for applicability, and 0.0–100.0% for editorial independence. Average scaled domain percentages for overall CPGs, from highest to lowest, were scope and purpose (91.5%), clarity of presentation (90.3%), editorial independence (57.0%), stakeholder involvement (56.7%), rigour of development (54.7%), and applicability (14.6%).

Additionally, CAM scaled domain percentage scores varied across CPGs, ranging from 72.2 to 100.0% for scope and purpose, 8.3–94.4% for stakeholder involvement, 17.7–72.9% for rigour of development, 27.8–94.4% for clarity of presentation, 0.0–12.5% for applicability, and 0.0–100.0% for editorial independence. Average scaled domain percentages for CPGs’ CAM sections were, from highest to lowest, scope and purpose (91.5%), clarity of presentation (64.0%), editorial independence (57.0%), rigour of development (45.9%), stakeholder involvement (27.8%), and applicability (2.1%).

Scope and purpose

Overall, all CPGs scored highly in scope and purpose, effectively communicating overall objectives and health questions in relation to the treatment and/or management of IBD. Different interventions, and their potential benefits and intended outcomes (e.g., induction of remission) were extensively described. The characteristics of the target population were also easily identifiable (e.g., “patients with mild-moderate UC” [47]).

Stakeholder involvement

There was great variation in overall stakeholder involvement domain scores. All CPGs scored at least moderately well in describing overall guideline development group characteristics of members’ geographic locations, and institutional affiliations, with some CPGs scoring higher for further describing members’ disciplines (e.g., gastroenterologist, or methodologist) and/or specific roles in the group [37, 38, 43, 54, 56, 57, 59, 70, 71, 77, 83, 85]. Most CPGs clearly identified their target users, though some CPGs scored more poorly for not explicitly stating target users and not detailing how the CPG may be used [42, 59, 66, 70, 72, 73]. Some CPGs did not at all consider patients’ views and preferences in guideline development [47, 48, 59, 66, 70, 73], while others mentioned considering or emphasizing patient values but did not elaborate on what/how information was gathered [42, 43, 69, 71, 72, 83]. CPGs that scored moderately to very well additionally described methods and strategies used to capture patient values (e.g., literature review, patient advocate on guideline panel) [37, 38, 54, 56, 57, 68, 77, 85] and/or outcomes of gathered information (e.g., preference of avoiding medications’ adverse events over preventing disease recurrence) [53, 54, 56, 57, 68, 77].

In contrast, all CPGs but one [54] scored poorly for the CAM stakeholder involvement domain. CPGs’ CAM target user scores mirrored overall target user scores, with only some CPGs not explicitly stating target users and how to use the CPG [42, 59, 66, 70, 72, 73]. Other than one CPG [54], none of the CPGs involved CAM practitioners in their guideline development group nor described patient preferences regarding CAM therapies.

Rigour of development

Most CPGs thoroughly described how systematic methods were used to find evidence (including CAM evidence) [37, 38, 43, 53, 56, 57, 59, 66, 68, 71, 77], though some CPGs did not include complete search strategies [59, 66]. Lower-scoring CPGs, with regard to systematic methods, either only described databases in which searches were performed [42, 47, 83, 85] or merely stated that literature searches were conducted [54, 70]. The lowest-scoring CPGs in systematic methods provided no evidence of a systematic literature search occurring [48, 69, 72, 73]. Surprisingly, few CPGs explicitly stated all relevant inclusion/exclusion criteria (e.g., study designs, outcomes, population) for both overall and CAM evidence [37, 38, 53, 68, 77], though the other CPGs described the relevant population and at least partially described some studies that were included [42, 43, 47, 48, 54, 56, 57, 59, 66, 69,70,71,72,73, 83, 85]. Strengths and limitations of the body of evidence (including CAM evidence) were thoroughly described by seven CPGs [37, 38, 43, 53, 56, 57, 66, 68, 71]. The remaining 12 CPGs reported on most aspects of strengths and limitations but were somewhat lacking in detail on study biases [42, 47, 48, 59, 69, 70, 72, 83] and/or the magnitude and consistency of results for benefits and harms [42, 48, 54, 73, 77, 83, 85]. Though lacking in CAM-specific considerations, most CPGs comprehensively described methods used for overall recommendation formulation [37, 38, 42, 43, 47, 56, 57, 59, 66, 71, 77, 83]. Accordingly, most CPGs incorporated a thorough consideration of health benefits versus harms in their overall and CAM recommendation formulation [37, 38, 42, 43, 47, 53, 54, 56, 57, 59, 66, 68, 71, 72, 77, 83, 85]. All CPGs also explicitly linked their recommendations (including CAM recommendations) with supporting evidence. Many CPGs merely described that external review occurred and/or described its purpose [37, 38, 43, 53, 54, 68, 70, 71, 77, 83].
No CPGs specifically detailed methods used or information gathered from an external review, nor did any CPG describe having CAM practitioners participate in an external review. Only four CPGs provided a procedure for updating overall CPGs (including CAM sections) with a specific timeframe [37, 38, 54, 77, 85].

Clarity of presentation

Recommendations were specific and unambiguous for all of the overall CPGs and for most of the CAM sections, though a few CPGs were considerably vague regarding intent/purpose [47, 68, 69, 72, 73, 77]. All CPGs clearly presented different options for overall management of IBD. However, only five CPGs identified many CAM therapy options for IBD [48, 54, 59, 66, 85], whereas other CPGs only provided CAM recommendations against a therapy’s use [43, 56, 57, 68, 71, 77] or provided few CAM options [37, 38, 42, 47, 53, 69, 70, 72, 73, 83]. Key recommendations (both overall and CAM sections) were easily identifiable for all CPGs except one [73].

Applicability

Applicability scaled domain percentages, for both overall CPGs and CAM sections, were generally poor. Only four CPGs described overall facilitators and barriers of recommendation implementation [66, 69, 77, 85], and no CAM facilitators or barriers were discussed in any CPG. Some CPGs provided limited advice or tools supporting recommendation implementation [37, 38, 54, 70, 77, 83, 85], mostly in links to compact guideline summaries or educational resources. Certain CPGs merely made mention of considering resource implications in formulating recommendations [37, 38, 42, 43, 72, 77, 83], while other CPGs’ recommendations additionally had some discussion of resource implications (e.g., recommendation caveats based on cost, interventions’ cost-effectiveness) [53, 56, 57, 59, 66, 68, 69, 71, 85]. No CPGs’ CAM sections discussed facilitators/barriers, provided advice/tools, or considered resource implications. Many CPGs had monitoring and/or auditing criteria for some recommendations [37, 38, 56, 57, 59, 66, 68, 77, 85], though in all instances they were lacking in detail. Few CPGs had monitoring and auditing criteria for CAM recommendations [54, 66, 70, 85].

Editorial independence

For both overall CPGs and CAM sections, most CPGs identified funding sources, though only certain CPGs had explicit statements of no influence [43, 54, 56, 57, 66, 71, 83] while others lacked explicit statements [37, 38, 47, 48, 53, 59, 68, 77, 85]. Five CPGs entirely lacked funding body statements [42, 69, 70, 72, 73]. Similarly, regarding competing interests for both overall CPGs and CAM sections, most CPGs identified conflicts of interest, though only one CPG satisfactorily addressed these conflicts [54] while others did not [37, 38, 43, 47, 48, 56, 57, 60, 69, 71, 72, 77, 83, 85]. CPGs that declared no pertinent conflicts also scored well [53, 59, 66, 68], whereas the lowest-scoring CPGs lacked a competing interests section [42, 73].

Discussion

The objective of this review was to determine the quantity and assess the quality of CPGs providing CAM recommendations for the treatment and/or management of IBD. There was a wide range of CAM categories covered by different CPGs, though most CPGs had only a few CAM recommendations. The quality of 19 CPGs with CAM recommendations was assessed using the 23-item AGREE II instrument (where on each item’s Likert scale, 1 indicates strongly disagreeing that an item’s criteria were met and 7 indicates strongly agreeing that an item’s criteria were met). Domain scores differed greatly between different CPGs. Regarding overall guidelines, four CPGs [37, 38, 54, 56, 57, 77] scored ≥ 5.0 (and seven CPGs [42, 47, 48, 69, 70, 72, 73] scored ≤ 4.0) in both average appraisal score and average overall assessment. Regarding guidelines’ CAM sections, no CPGs scored ≥ 5.0 (and eleven CPGs [42, 47, 48, 59, 68,69,70, 72, 73, 83, 85] scored ≤ 4.0) in both average appraisal score and average overall assessment.

Comparative literature

Although this review is, to our knowledge, the first to determine the quantity and assess the quality of CPGs providing CAM recommendations for the treatment and/or management of IBD, our findings can be compared with published reviews assessing both IBD CPGs as well as CAM recommendations in CPGs relating to other disease topics.

One study conducted a systematic review of IBD diagnosis and/or treatment CPGs, applying the AGREE II instrument and reporting similar average scaled domain percentages: clarity of presentation (85.58%), scope and purpose (84.51%), editorial independence (62.02%), rigour of development (69.95%), stakeholder involvement (60.90%), and applicability (26.60%) [87]. The study’s authors concluded that the quality of most evaluated CPGs was acceptable, though there was room for improvement in the domains of stakeholder participation and applicability [87]. Another systematic review applied the AGREE II instrument to pharmacological therapy recommendations in IBD CPGs; compared with the present review, its results differed considerably in the domains of editorial independence (94.0%), applicability (45.8%), and stakeholder involvement (38.9%) [88]. The pharmacological study discussed causes of heterogeneity between CPGs’ pharmacological recommendations (including varying efficacy of drugs in CD versus UC, special populations like pediatric patients, and potential developer bias in recommendation formulation), while suggesting that future guidelines could be improved through more refined recommendations based on target population characteristics (e.g., appropriate remission recommendations for severe UC adult patients may differ from moderate UC pediatric patients) [88]. A third study systematically reviewed diagnostic approaches in IBD CPGs and observed heterogeneity in diagnosis recommendations, while identifying domains of stakeholder involvement, rigour of development, and applicability as areas of improvement [89]. Finally, a systematic review examined the conflicts of interest and quality of evidence used for recommendations present in IBD CPGs, where authors noted considerable variation in recommendations’ evidence quality and numerous conflicts of interest in many CPGs [90].

This present review’s CAM section results can be compared to the findings of a systematic review that examined CPGs focused primarily on CAM therapies (e.g., herbal medicine, acupuncture, spinal manipulation) [91]. The CAM study had markedly different average scaled domain percentage findings: scope and purpose (83.3%), clarity of presentation (85.3%), editorial independence (60.1%), rigour of development (61.2%), stakeholder involvement (52.0%), and applicability (20.7%) [91]. Nonetheless, the CAM study similarly noted the paucity of high-quality CAM CPGs and variation in quality across domains [91]. Other studies assessing quality of CPGs’ CAM recommendations (using AGREE II) across various diseases/conditions (e.g., rheumatoid arthritis and osteoarthritis, colon cancer, multiple sclerosis, anxiety, depression) had similar trends in average scaled domain percentages of CAM sections, with clarity of presentation and scope and purpose domains tending to score higher, and stakeholder involvement and applicability tending to score lower [92,93,94,95,96].

Overall, this study revealed that there are few high-quality CPGs that comprehensively cover CAM therapy recommendations on IBD treatment and/or management. Of the 19 evaluated CPGs, 12 had only one or two CAM recommendations [37, 38, 42, 43, 47, 69,70,71,72,73, 77, 83, 85]. Many of these CPGs’ CAM recommendations were based on low-quality evidence [43, 68, 70, 71], or they were recommendations indicating knowledge gaps or neutral statements [47, 66]. Of seven CPGs with three or more CAM recommendations, four CPGs’ CAM recommendations consisted almost entirely of neutral/open recommendations [53, 54, 59, 69], generally due to knowledge gaps [53] or insufficient evidence [59, 69] for recommendations in favour of a given CAM therapy. For another CPG, two out of three CAM recommendations had very-low quality of evidence, and all three recommendations were against CAM therapy use [56, 57]. This study also found that the quality of CPGs’ CAM sections varied within each guideline (throughout different AGREE II domains) and between different guidelines.

There is a dearth of high-quality CAM research to support informed decision making on CAM use among healthcare professionals and patients. Challenges to CAM research include the absence of quality control and regulation of herbal supplements [10], difficulty blinding physical interventions (e.g., acupuncture) or mind-body techniques (e.g., yoga) in study design [10], lack of funding [97], and bias against CAM research [97]. Despite these challenges, the use of CAM is highly prevalent among patients with IBD [10, 19,20,21,22,23]. Many patients with IBD also do not disclose their use of CAM to healthcare professionals, while many healthcare professionals have limited knowledge of CAM [10]. Altogether, patients’ hesitancy or inability to consult their healthcare provider about CAM therapies may negatively impact patient care and undermine shared and informed decision making. A greater availability of high-quality CAM recommendations in CPGs, however, may present an opportunity for healthcare professionals to confidently provide informed advice on CAM use. Given the varying quality of CAM recommendations in CPGs, future development or updating of IBD CPGs can improve guidelines’ CAM sections. One domain that could be improved in future CPGs is stakeholder involvement, as most CPGs’ guideline development groups lacked CAM experts, who may be knowledgeable about more therapies relevant to IBD treatment and/or management and may help to increase the quantity of CAM recommendations. Similarly, incorporating patients’ views and preferences on CAM as part of the guideline development process can better inform healthcare professionals on shared care and decision-making principles [98].
For instance, the development of a CPG for the management of increased intestinal permeability [99] was informed by a cross-sectional survey of patient behaviours and preferences (which included questions about naturopathic practitioners and dietary supplements), which allowed the guideline to discuss the discrepancies between patients’ most commonly used dietary supplements and current evidence-based recommendations. Additionally, the same CPG had a diverse guideline development group that involved CAM experts, including naturopathic practitioners and integrative medicine practitioners, which provided the opportunity to consider the concordance between published evidence and clinical practice on managing increased intestinal permeability [99]. Ultimately, incorporating feedback from experts and patients may help increase uptake of the CPG among these target users [99, 100]. Applicability is another domain that could be improved, as CPGs lack tools that clinicians can use to implement CAM recommendations in patient care plans and to monitor therapy efficacy. One way to address this would be for CPGs to include additional resources such as guides on facilitating CAM use discussions with patients, flow chart and algorithm versions of CPGs for deciding which CAM therapy is the most appropriate in a given situation, and patient versions of CPGs [101, 102]. Regarding rigour of development, given that many CPGs did not describe search strategies with many CAM terms (if search strategies were described at all), developers may consider including more CAM terms in literature searches to potentially yield a greater body of CAM evidence for recommendation formulation. The AGREE II instrument can be used to identify criteria important for guideline reporting [31]. Furthermore, there exist other frameworks and checklists to help guide CPG development [103,104,105].

Strengths and limitations

One strength of this study is the use of systematic methods in identifying eligible CPGs for the treatment and/or management of IBD, though it is possible our search did not identify all relevant CPGs. Another strength is the use of the AGREE II instrument, which is widely accepted as the gold standard tool for evaluating CPGs [31,32,33]. A corresponding limitation, however, is that CPGs with CAM recommendations were evaluated by only two appraisers, rather than four appraisers as recommended by the AGREE II manual [31,32,33]. This limitation was partially addressed by JYN, MCW, and HL conducting a pilot test to standardize scoring, in which three different non-IBD CPGs were independently appraised before results were discussed to achieve consensus on how to apply the AGREE II instrument. Furthermore, independent appraisals of the 19 IBD CPGs with CAM recommendations by MCW and HL were followed by meetings and discussions with JYN to resolve uncertainties, while taking care not to alter legitimate score discrepancies.

Conclusions

The present study identified 49 CPGs published on IBD treatment and/or management since 2011, of which 19 CPGs made recommendations on CAM therapies such as probiotics, dietary and herbal supplements, fecal microbial transplantation, and mind-body medicine. Evaluation of these 19 CPGs with the AGREE II tool revealed variable quality within and across CPGs. Most CPGs had substantially lower AGREE II scores for their CAM sections than for the overall CPG, and only one CPG was recommended for use by both appraisers. For future IBD guideline development and updates, CPGs with lower scaled domain percentages (for both overall and CAM-specific sections) could be improved with reference to the AGREE II instrument, as well as other guideline development resources. The generally low quality of IBD CPGs’ CAM sections and the low quantity of CAM recommendations in most CPGs present a barrier to informed decision-making on CAM therapies among patients and healthcare professionals. Overall, future IBD guideline development would greatly benefit from improving CPGs’ CAM sections, specifically with regard to considering patients’ views on CAM, collaborating with CAM experts, providing CAM resources for patients and clinicians, and incorporating a greater quantity and quality of CAM evidence and recommendations.

Fig. 1 PRISMA Diagram

Fig. 2 Summary of CAM Recommendations in Clinical Practice Guidelines