Introduction

Cryptoglandular anal fistula (AF) is a challenging condition to manage. The symptom burden can be severe and can have a wide-ranging impact on physical functioning and quality of life [1]. For clinicians, the difficulties of balancing treatment efficacy with minimal impairment of continence have been well documented [2, 3], particularly for complex and recurrent cases. In an attempt to reconcile these two key treatment aims, numerous sphincter-preserving procedures have been developed in recent decades. These procedures have now made their way into common clinical practice, leading to wide variation in the techniques used according to surgical expertise, preference, and geographical area [4]. Along with the expansion of procedures, numerous interventional studies have been conducted to assess success rates and determine treatment superiority. Attempts have been made to meta-analyse data from multiple studies; however, difficulties in doing so reliably are frequently reported, owing to inadequate follow-up, a lack of randomized controlled trials, and non-uniform reporting of outcomes [4,5,6]. This limits the development of treatment guidelines for AF.

The selection of relevant and appropriate outcomes is crucial to any study on treatment effectiveness [7]. However, the lack of a systematic approach results in the reporting of numerous outcomes with varied definitions, multiple measurement instruments, and inconsistencies in the timing of assessment. Furthermore, selective reporting of outcomes based on significant results is a recognised problem and can overestimate the size of the treatment effect [8, 9]. Such outcome reporting bias can lead to ill-informed decisions with the potential to cause patient harm [10].

One way of addressing such issues is to develop a core outcome set (COS): an agreed, standardised set of outcomes to be measured in all interventional studies for a specific health condition [9]. The importance and value of a COS in disease areas with heterogeneity in outcome reporting are increasingly recognised. However, a COS has not yet been developed for cryptoglandular AF. We believe that this is an important step in addressing the challenges in developing evidence-based management strategies.

According to the Core Outcome Measurement in Effectiveness Trials (COMET) initiative, the first stage in the development of a COS is to determine what to measure, which can be partially achieved by identifying potential outcomes from the existing literature [7]. The primary aim of this systematic review was to identify all patient- and clinician-reported outcomes in studies assessing medical, surgical or combination treatment of adult patients with cryptoglandular AF, to inform the development of a cryptoglandular Anal Fistula Core Outcome Set (AFCOS) [11]. The secondary aim was to assess outcome definitions and identify the measurement instruments used.

Materials and methods

A systematic review of studies assessing medical, surgical, and combined interventions for cryptoglandular AF was performed in accordance with a registered protocol (PROSPERO-ID CRD42018102778).

Search strategy

An electronic search strategy was developed by an information specialist prior to execution. The following electronic databases were searched, adjusting vocabulary and syntax for each: Medline (Ovid), Embase (Ovid), and The Cochrane Library. Validated terms for ‘Perianal Fistula’ were used, ensuring that all interventional studies for AF could be captured. If MeSH terms or subject headings existed, these were included in the search strategy and supplemented with free-text searches of the same databases. To avoid limiting the scope of outcomes identified, no study design filter was applied. The search was restricted to full-text articles in English published from January 2008 to May 2020 and to studies conducted in human subjects aged ≥ 18 years. The full search strategy can be found in Table 1.

Table 1 Search strategy

Study selection

Four members of the study management group (AM, NI, KS, SA) identified and screened titles and abstracts using Covidence Systematic Review Software (Veritas Health Innovation, Melbourne, Australia, available at https://www.covidence.org/home), with each abstract and full-text publication screened by two independent group members. The following predefined selection criteria were used: (1) prospective [including randomised controlled trials (RCTs), cohort comparisons, case controls and case series], retrospective, and observational studies, as well as systematic reviews, published between January 2008 and May 2020; (2) including ≥ 10 adult patients (aged ≥ 18 years) with cryptoglandular AF; (3) assessing medical, surgical, or combined interventions for cryptoglandular AF; and (4) reporting ≥ one outcome. Studies were excluded if they were abstract only or if they reported on interventions that were assessed only on fistulas that were not perianal or not of cryptoglandular origin. Systematic reviews were included, and the individual studies within them were checked for eligibility. Disagreements were resolved through discussion with recourse to the senior authors (PT, SB) if necessary.

Data extraction

Two members of the study management group (AM, NI) extracted data from eligible studies using a predefined data extraction sheet created in Microsoft Excel. Extracted data included study publication year, design, interventions, patients, outcomes (primary and secondary), outcome definitions and measurement instruments used. In keeping with COMET recommendations, all data were extracted verbatim [7]. The quality of outcome description and reporting was assessed using Harman’s criteria [12], and the results are presented in Table 2. Disagreements were resolved through discussion with recourse to the senior authors (PT, SB) if necessary.

Table 2 Overview of the included studies

Data synthesis

Outcome categorisation

The resulting list of outcomes was reviewed by the study management group, including patient representatives (AM, NI, GK, RW, HG, MK, UG, PT, SB) to enable those with similar wording or meaning to be reduced to a single outcome. These were then mapped according to the COMET taxonomy developed for outcomes in medical research [13]. In this taxonomy, the measurable aspects of health conditions can be structured into five core areas, namely death, physiological or clinical, life impact, resource use, and adverse events, and further subdivided into 38 domains.

Data analysis

Primary, secondary, and overall outcome reporting were analysed. Results were summarized using frequencies and percentages. The frequency of outcome domain reporting was calculated. The interventions studied, number of outcome definitions and measurement instruments used were collated and analysed.
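As an illustration of the frequency analysis described above, the following is a minimal Python sketch, not the analysis actually used in this review. The function name `outcome_frequencies` and the toy study data are hypothetical, and the sketch assumes that each study contributes at most once to the count for a given merged outcome.

```python
from collections import Counter


def outcome_frequencies(studies: dict[str, set[str]]) -> dict[str, tuple[int, float]]:
    """Return {outcome: (number of studies reporting it, percentage of all studies)}."""
    n_studies = len(studies)
    # Each study holds a set of outcomes, so an outcome is counted once per study.
    counts = Counter(outcome for outcomes in studies.values() for outcome in outcomes)
    return {
        outcome: (count, round(100 * count / n_studies, 1))
        for outcome, count in counts.most_common()
    }


# Hypothetical toy data for illustration only; not taken from the included studies.
toy_studies = {
    "study_a": {"healing", "incontinence"},
    "study_b": {"healing", "pain"},
    "study_c": {"healing", "incontinence", "recurrence"},
    "study_d": {"recurrence"},
}
frequencies = outcome_frequencies(toy_studies)
```

Storing outcomes as a set per study is a deliberate simplification: an outcome measured at several time points within one study would then still count once towards the study-level percentage.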

Results

Search strategy and study selection

The electronic databases Medline (Ovid), Embase (Ovid), and The Cochrane Library were searched in May 2018, followed by an updated search in May 2020, identifying a total of 2583 records. A schematic overview of the inclusion and exclusion of articles, including reasons provided for exclusion, is presented in Fig. 1. Full-text screening resulted in the inclusion of 143 articles, including 15 systematic reviews. The systematic reviews were individually screened for any additional studies that were not captured by the initial search and this yielded 27 articles, resulting in a final number of 155 articles from which data were extracted.
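The article counts above can be reconciled with a short arithmetic check. The sketch below assumes, as the totals imply but the text does not state explicitly, that the 15 systematic reviews were used only as a source of further studies and were not themselves counted among the articles from which data were extracted. Variable names are illustrative.

```python
full_text_included = 143   # articles included after full-text screening
systematic_reviews = 15    # systematic reviews among those 143 articles
added_from_reviews = 27    # further articles found by screening the reviews

# Assumption: the systematic reviews themselves are not data-extraction sources.
primary_studies = full_text_included - systematic_reviews
final_total = primary_studies + added_from_reviews
```

Under this assumption, `final_total` matches the 155 articles reported.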

Fig. 1 Preferred reporting items for systematic reviews and meta-analyses flow chart of study selection

Study characteristics

An overview of the 155 included studies is presented in Table 2. Interventions for cryptoglandular AF were assessed on a total of 11,819 patients (mean 76, range 10–462 participants per study). The majority of studies were prospective (52%) and assessed the effectiveness of sphincter-preserving procedures, of which fistula plugs (19%) and ligation of intersphincteric fistula tract (LIFT) procedures (19%) were assessed most frequently. The characteristics of the included studies are presented in Table 3. The quality of outcome reporting for each individual study was assessed using Harman’s criteria [12] and is reported in Table 2. The criteria involve assessing whether: (1) the primary outcome for a study is clearly stated, (2) the primary outcome is clearly defined so that other researchers can reproduce its measurement, (3) the secondary outcomes are clearly stated, (4) the secondary outcomes are clearly defined, (5) the authors explain the use of the outcomes they have selected and (6) any methods were used to enhance the quality of outcome measurement. The average number of criteria met across all studies was two, with only 38 of 155 studies (25%) meeting ≥ four criteria, indicating high-quality outcome reporting in just a quarter of the studies assessed.
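The scoring against Harman's six criteria can be sketched as follows. This is an illustrative Python sketch rather than the procedure used by the reviewers: the criterion labels, the function names, and the example assessment are hypothetical, and the threshold of four criteria reflects the cut-off for high-quality reporting described above.

```python
# Short labels for the six criteria listed in the text (labels are illustrative).
HARMAN_CRITERIA = (
    "primary_outcome_stated",
    "primary_outcome_defined",
    "secondary_outcomes_stated",
    "secondary_outcomes_defined",
    "outcome_choice_explained",
    "quality_enhancing_methods_used",
)


def harman_score(assessment: dict[str, bool]) -> int:
    """Count how many of the six criteria a study meets (missing keys count as unmet)."""
    return sum(bool(assessment.get(criterion, False)) for criterion in HARMAN_CRITERIA)


def is_high_quality(assessment: dict[str, bool], threshold: int = 4) -> bool:
    """A study is treated as high quality when it meets at least `threshold` criteria."""
    return harman_score(assessment) >= threshold


# Hypothetical assessment of a single study, for illustration only.
example = {
    "primary_outcome_stated": True,
    "primary_outcome_defined": True,
    "secondary_outcomes_stated": True,
}
example_score = harman_score(example)

# Share of studies meeting at least four criteria, using the counts reported above.
high_quality_share = round(100 * 38 / 155)
```

The last line reproduces the reported proportion of roughly a quarter of studies from the counts given in the text.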

Table 3 Study characteristics

Study outcomes

In total, 552 patient- and clinician-reported outcomes were extracted from 155 studies, with studies reporting a median of three outcomes (interquartile range 2–5) per study. Duplicate and analogous terms were merged to form 52 outcomes, of which healing (77%), incontinence (63%), recurrence (40%), and pain (26%) were reported most frequently (Table 4). Outcomes such as healing and recurrence were sometimes measured at different time points within the same study but referred to as primary or secondary outcomes. This resulted in some studies reporting outcomes of healing and recurrence more than once.

Table 4 Frequency of outcome reporting

Outcome categorisation

The outcomes were categorised into core areas and domains according to the COMET taxonomy, with guidance from a member of COMET. The frequency of these outcomes and their categorisation is shown in Table 5. Adverse event outcomes are categorised under their appropriate taxonomy and identified as a harm outcome [13]. Cryptoglandular AF treatment rarely impacts lifespan; therefore, the core area death was excluded from categorisation. Some outcomes were categorised in multiple domains, as the study management group considered their impact to be broad. For instance, ‘problems related to sexual function’ was included in the domains physical, social and emotional functioning and well-being. Outcomes belonging to the core area of ‘physiological or clinical’ were placed in domains according to their underlying cause or affected body system [13]. Whilst categorisation highlighted the spread of outcomes across all relevant domains, the majority focused on the physiological or clinical impact, particularly in the domain of gastrointestinal outcomes (99%), whereas only 12% of outcomes were related to the impact on physical, role and social functioning and emotional functioning and well-being (Table 5).

Table 5 Outcome categorisation and frequency of outcome reporting according to the COMET taxonomy

Outcome definitions

Significant heterogeneity in outcome definition and overlap between definitions was noted in the outcomes of ‘healing’, ‘recurrence’, and ‘treatment failure’.

Healing

Healing was reported in 120 studies (77%) and was synonymous with terms such as ‘healing rate’, ‘fistula closure’, ‘success’, ‘cure’, ‘effectiveness’, and ‘complete clinical response’. There was considerable heterogeneity in the definitions of healing; however, overlap between the components of each definition meant that all could be defined by using one or more of the components presented in Table 6. Considering the ways in which components could be combined, 34 different definitions were found. Healing was most frequently defined as ‘healing of the external fistula opening and absence of symptoms’ (n = 16). In nine studies, a radiological assessment was needed to confirm or refute healing [14,15,16,17,18,19,20,21,22], whereas another study identified ‘radiological healing’ as a separate outcome [23]. Five of these 10 studies included the radiological description required to demonstrate healing [14, 15, 18, 21, 22]. In 21 studies, the definition of healing was dependent upon a time period after which the fistula should be assessed, or for the duration of which the components of healing should be present. These time periods themselves varied considerably, ranging from 2 weeks [24] to 12 months [16, 25] after the procedure.
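One way to count distinct definitions built from a shared pool of components is to treat each definition as an unordered set of components, so that two definitions using the same components in a different order are counted once. The Python sketch below illustrates this idea only; it is not the method used in this review, and the component labels and toy definitions are hypothetical.

```python
def count_distinct_definitions(definitions: list[set[str]]) -> int:
    """Count unique definitions, treating each as an unordered set of components."""
    return len({frozenset(components) for components in definitions})


# Hypothetical toy definitions of 'healing'; labels are illustrative only.
toy_definitions = [
    {"external_opening_closed", "no_symptoms"},
    {"no_symptoms", "external_opening_closed"},  # same components, so same definition
    {"external_opening_closed"},
    {"external_opening_closed", "internal_opening_closed", "no_symptoms"},
]
n_distinct = count_distinct_definitions(toy_definitions)
```

A time qualifier, such as the period after which the fistula is assessed, could be added as a further component if definitions differing only in timing are to be counted separately.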

Table 6 Components used, in varying combinations, to define the outcome ‘healing’

Recurrence, treatment failure and persistence

The terms recurrence, treatment failure, and persistence were used interchangeably to describe a spectrum of clinical manifestations, ranging from no evidence of closure or persistence of fistula and symptoms [26,27,28,29], to temporary closure followed by re-appearance of the original fistula [26], to the development of additional fistulas [20, 30,31,32]. Similar to healing, the definitions were broken down into components which are presented in Table 7. The most frequently used definitions were ‘persistence or recurrence of symptoms’ (n = 21), followed by ‘persistence or reappearance of the external fistula opening’ (n = 13). There were 19 different definitions of recurrence and treatment failure. In 10 studies, the definition was qualified by a time period at or after which the fistula had to be assessed, ranging from within the first month [20] to 12 months after treatment [33].

Table 7 Components used, in varying combinations, to define the outcomes ‘recurrence’ and ‘treatment failure’

Outcome measurement instruments

Heterogeneity was noted amongst the measurement instruments used for the most frequently reported outcomes (Table 8). Combinations of measurement instruments were frequently used. Furthermore, the instruments for each outcome were not always clearly stated and many studies used unspecified questionnaires.

Table 8 Measurement instruments used, in varying combinations, to assess the most frequently reported outcomes

Discussion

This systematic review is the first study to provide an overview of the outcomes reported in interventional studies for AF. We identified 552 outcomes from 155 studies published in the last 12 years, which were merged into 52 unique outcomes, of which healing was reported most frequently (77%). Our results demonstrate heterogeneity in outcome definition and measurement, making the use of such studies to supplement current understanding of fistula management and guide treatment pathways much more challenging.

The lack of consistency and clarity in definitions of success, treatment failure, and recurrence after fistula treatment has been previously noted [34]. Despite being one of the most frequently reported outcomes, healing was variably defined in terms of anatomical features, absence of a specific set of symptoms or healing of the (surgical) wound. This highlights the difficulty of data synthesis across different studies, particularly when a fistula is considered healed in one study simply by closure of the external fistula opening [35], but would be considered persistent in another, where closure of both the external and internal fistula openings and an absence of symptoms are required [36]. The inclusion of radiological healing adds further complexity, as it is well documented that deep tissue healing of perianal fistula as assessed on magnetic resonance imaging lags behind clinical healing by a period of months [37,38,39]. Nevertheless, radiological outcomes and objective measures of the disease have been frequently used in studies of AF, and their potential inclusion in a COS warrants further discussion and involvement of radiological expertise.

The various definitions of recurrence, persistence, and treatment failure demonstrated overlap. However, in line with previous suggestions [34], we determined that treatment failure and persistence of the fistula, i.e. no change in the morphology and symptomatology of the original fistula, should be differentiated from fistula recurrence, which describes reappearance of the fistula after a period of resolution, and that development of new fistulas should be considered separately. Nevertheless, persistence and recurrence of fistulas could simply be the same problem viewed at different time points, and from a patient’s perspective 1 year after the intervention, the difference is probably minimal. This would be an interesting area to explore during the generation of the COS.

The quality of studies eligible for data extraction was assessed using Harman’s criteria [12]; however, only a quarter of the studies demonstrated high-quality outcome reporting using this method. Whilst the majority of studies clearly stated their measured outcomes, few went as far as defining whether the outcomes were primary or secondary. Only 20% of the studies explained their reasoning for selecting their outcomes. This may be because healing, incontinence, and recurrence, the most commonly reported outcomes, require little explanation for their selection to fistula surgeons or patients, as the ultimate aim of any fistula treatment is frequently cited as healing with minimal impact on continence and minimal risk of recurrence.

The outcomes summarised in this systematic review were categorised according to the COMET taxonomy. Although all relevant domains are represented, the vast majority of outcomes are related to the pathophysiology of disease and treatment. Only 10% of the outcomes reported by all studies in the last 12 years were related to the impact of disease in terms of its influence on patients’ physical, social and role functioning, in other words their quality of life. Whilst the inclusion of outcomes such as these is encouraging and should be recognised, their use is infrequent and gives a narrow reflection of the wide-ranging impact that fistula symptoms or treatments have for patients. For example, whilst the impact on sexual functioning has been recognised, neither the wider effects on personal and social relationships nor the influence of symptoms on non-work-related activities have been recorded. Whilst the pathophysiological aspects of the disease are inevitably interrelated with life impact and use of resources, focusing only on the physical symptoms fails to address adequately the wider impact of living with AF. Earlier studies have identified that patients and surgeons allocate importance to different aspects of quality of life associated with anal fistula and its treatment. Surgeons rated continence, leakage, pain, cure and sepsis as most important, whereas patients identified independent activity, good health, pain, continence, psychological health and leakage as their most important aspects of quality of life [40]. We are currently conducting further qualitative work to explore patients’ experiences of disease. Patient involvement in deciding the final COS and how these outcomes should be prioritised is crucial to ensure that the COS remains representative of all stakeholders [7] and centred around relevance to patients.

The current study reported the range of outcome measurement instruments used for the most frequently reported outcomes. Validated measures were largely used for outcomes such as incontinence and quality of life, allowing the benefit of comparison across studies, as well as with other chronic health conditions [41]. However, the broad range of validated measures across studies for AF makes it difficult to compare these specific outcomes across interventions. This supports the need for a systematic method of selecting appropriate Outcome Measurement Instruments (OMIs) once the final COS is established [7, 42]. Furthermore, most measurement instruments of quality of life were generic. Disease-specific measures are known to be more sensitive to change and can directly detect the specific concerns of particular clinical groups, which may be underrepresented in generic measurement instruments [43]. Planned qualitative work will help to determine whether the concerns of patients with AF are adequately addressed by these instruments, or whether the development of a disease-specific Patient-Reported Outcome Measure (PROM) is needed.

The strength of this systematic review is that, given the range of studies reviewed, it is well placed to inform a long list of items for the development of a COS. However, it is limited by the lack of outcomes related to quality of life, suggesting that the additional qualitative feedback from patients required by COMET to supplement this longlist is crucial. Although it is possible that not all relevant studies have been captured due to the eligibility criteria used, the sheer number of outcomes extracted from the included studies makes it likely that saturation has been reached and that any additional outcomes would be procedure specific and, therefore, not eligible for a generic COS representing a minimum set of outcomes to be adopted by all studies, regardless of the intervention used. A further limitation is the English language inclusion criterion, although no abstracts or full texts were excluded based on the language criterion alone; rather, they studied the wrong population or were review articles or commentaries. The lack of non-English papers may limit the generalisability of these findings across cultural and ethnic groups. This may be effectively countered through the subsequent longlisting and consensus processes, which will include participants from a broad range of ethnic and cultural backgrounds.

Conclusions

This systematic review highlights the need for consensus amongst researchers and clinicians regarding the outcomes that are essential in determining successful fistula treatment, and how they should be defined and measured. The underrepresentation of outcomes relating to quality of life needs to be challenged, and both qualitative exploration of the patient experience and active engagement of patients in determining a COS are crucial.