Key Summary Points

The prevalence of hidradenitis suppurativa (HS) varies throughout the literature. This study applied two stepwise algorithms to estimate possible/diagnosed cases of HS in the US from four databases.

Based on International Classification of Diseases (ICD) codes, the 5-year period prevalence of HS ranged from 0.06% to 0.12% across all databases, while based on the two algorithms, 5-year period prevalence ranged from 0.27% to 0.78%.

In both algorithms and across all databases, adults and females generally had a higher 5-year period prevalence versus pediatric patients and males, respectively.

These data highlight that HS diagnosis codes alone may not be sufficient to estimate the true prevalence and burden of HS. Integrating algorithms into prevalence assessments may provide more accurate prevalence estimates; the true prevalence of HS likely lies somewhere between the prevalence estimated from the algorithms and the prevalence based on the HS ICD codes.

Introduction

Hidradenitis suppurativa (HS) is a chronic, recurrent, painful, inflammatory skin disease that is associated with a high disease burden and multiple systemic comorbidities [1,2,3,4]. Patients with HS experience a substantial delay in diagnosis of approximately 7–10 years [3, 5, 6]. In parallel with a delayed diagnosis, patients may also experience a delay in timely treatment, which can lead to disease progression [7, 8]. Due to the spectrum of disease severity of HS, current treatment guidelines recommend an escalating order of combined medical and surgical treatments [9,10,11].

The global and regional prevalence of HS is unclear: reported estimates range from 0.05% to 4.1%, a > 80-fold difference [12,13,14,15]. As HS is frequently undiagnosed and misdiagnosed [6, 16, 17], many HS cases may be missed, and thus the prevalence may be underestimated. To address the discrepancies between the prevalence estimates and to identify potentially undiagnosed cases of HS, Ingram et al. developed and validated an algorithm using the Clinical Practice Research Datalink (CPRD), which is linked with the Hospital Episode Statistics (HES) database, in the UK [18]. In that analysis, physician-diagnosed cases of HS in the CPRD-HES were identified from specific Read codes (M25y100 and M25y111) or the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) code L73.2 [18]. In addition, Read code algorithms were created to capture potentially undiagnosed cases of HS by identifying patients attending primary care for multiple skin boil consultations (≥ 5 Read codes for boils in flexural sites), and a subsequent algorithm hierarchy was created; treatments for skin boils were also captured as potential undiagnosed cases (i.e., short course of skin-directed antibiotics in the absence of any other skin antibiotic indication) [18].

The results of the study by Ingram et al. demonstrated that one-third of identified cases were undiagnosed, highlighting the potential unmet need for treatment [18]. Herein, we report the application of an adapted algorithm described in Ingram et al. [18] and a modified version of this algorithm to estimate the prevalence of possible/diagnosed cases of HS in the US.

Methods

Study Design and Data Sources

This was a descriptive, non-interventional, retrospective cohort study of adult and pediatric patients with HS in the US. This study utilized secondary data sources from four databases: Truven Health’s MarketScan (including Medicare and Medicaid) and Optum’s electronic health record (EHR) and Clinformatics Data Mart (CDM) databases. All data included in this article were obtained from claims and electronic health record databases; no patient-level data were collected or stored.

Truven Health Analytics’ MarketScan database provides convenience sample data from approximately 200 million covered persons in the US enrolled in health insurance plans and includes person-specific clinical utilization, costs, and enrollment across inpatient, outpatient, prescription drug, and carve-out services. MarketScan Medicare consists mainly of patients aged ≥ 65 years, or patients with a disability, while MarketScan Medicaid consists mainly of patients with low incomes. Optum’s EHR database contains de-identified and aggregated clinical and medical administrative data from 85 US health care delivery organizations across 50 states and includes > 140,000 providers at 740 hospitals across > 7000 clinics. Currently, the database contains records for 97 million patients with ≥ 10 million patients in each of the main geographical regions in the US. Optum’s CDM database provides data relating to 57.6 million covered persons and includes enrollment data, physician and facility claims data, pharmacy claims data, and laboratory results from affiliated commercial and Medicare Advantage members.

The study period was from 1 January 2013 to 31 December 2019, and the identification period was from 1 January 2014 to 31 December 2018. The index date was defined as the date of the first claim related to the algorithms, or, for the second and third secondary objectives, the first claim of a diagnosis of HS during the identification period. The pre-index period was 365 days before the index date, excluding the index date, and the post-index period was 365 days after the index date, with the index date included.
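The index-date windows described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's code: the function name `index_windows` is hypothetical, and the post-index window is assumed to run from day 0 through day 365 inclusive, which is one reading of "0–365 days post-index."

```python
from datetime import date, timedelta

# Identification period as stated in the text.
IDENT_START = date(2014, 1, 1)
IDENT_END = date(2018, 12, 31)


def index_windows(index_date: date):
    """Return (pre_start, pre_end, post_start, post_end), all inclusive.

    Assumption: the post-index window spans day 0 (the index date)
    through day 365; the pre-index window is the 365 days before the
    index date, excluding the index date itself.
    """
    if not (IDENT_START <= index_date <= IDENT_END):
        raise ValueError("index date falls outside the identification period")
    pre_start = index_date - timedelta(days=365)
    pre_end = index_date - timedelta(days=1)
    post_start = index_date
    post_end = index_date + timedelta(days=365)
    return pre_start, pre_end, post_start, post_end
```

Under this reading, the earliest and latest possible index dates produce windows that start on 1 January 2013 and end on 31 December 2019, which matches the stated study period.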

Study Objectives

The primary objective of this study was to apply an adapted version of the algorithm developed by Ingram et al. [18] (Algorithm 1) and a US-modified algorithm (Algorithm 2) to identify the number of patients with possible/diagnosed HS. Secondary objectives of this study included assessing (1) the proportion of patients with possible/diagnosed HS with actual HS diagnosis codes in the respective data sources and those identified with the algorithms only; (2) the proportion of patients with HS diagnosis codes who could be identified with the algorithms; and (3) period prevalence estimates (5 year) of patients with diagnosed HS and with possible/diagnosed HS overall and based on demographic characteristics.

Patients

Patients with possible/diagnosed HS were identified using Algorithm 1 and Algorithm 2 (Table 1) and codes including ICD-9, ICD-10, Current Procedural Terminology (CPT), Healthcare Common Procedure Coding System (HCPCS), and National Drug Code (NDC). Patients with diagnosed HS were defined as having ≥ 2 outpatient or ≥ 1 inpatient diagnosis codes of HS (ICD-9-CM code: 705.83; ICD-10-CM code: L73.2) during the identification period.
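As an illustration of the diagnosed-HS definition, the following Python sketch applies the "≥ 2 outpatient or ≥ 1 inpatient" rule to one patient's claims. The tuple layout and the function name `is_diagnosed_hs` are assumptions made for the example, and the sketch counts claims rather than distinct service dates because the text does not specify which was used.

```python
from datetime import date

# HS diagnosis codes named in the text (ICD-9-CM and ICD-10-CM).
HS_CODES = {"705.83", "L73.2"}
IDENT_START, IDENT_END = date(2014, 1, 1), date(2018, 12, 31)


def is_diagnosed_hs(claims):
    """claims: iterable of (service_date, setting, code) for one patient.

    `setting` is assumed to be either "inpatient" or "outpatient".
    Only HS-coded claims inside the identification period are counted.
    """
    inpatient = outpatient = 0
    for service_date, setting, code in claims:
        if code not in HS_CODES:
            continue
        if not (IDENT_START <= service_date <= IDENT_END):
            continue
        if setting == "inpatient":
            inpatient += 1
        elif setting == "outpatient":
            outpatient += 1
    return inpatient >= 1 or outpatient >= 2
```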

Table 1 Description of Algorithm 1 and Algorithm 2 and associated sub-algorithms

Eligibility Criteria

Patients with continuous medical and pharmacy benefits in the 365 days pre-index period (allowing a gap of ≤ 45 days) and patients with continuous medical and pharmacy benefits in the 0–365 days post-index period (allowing a gap of ≤ 45 days) were eligible for inclusion in the study.
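A continuous-coverage check with an allowed gap could be sketched as follows. This is a hypothetical helper rather than the study's implementation; it assumes that the 45-day allowance applies to each individual gap (not to the cumulative gap) and that coverage spans are given as inclusive start and end dates.

```python
from datetime import date, timedelta

MAX_GAP_DAYS = 45  # allowed gap stated in the eligibility criteria


def continuously_enrolled(spans, window_start, window_end, max_gap=MAX_GAP_DAYS):
    """Return True if no single uncovered stretch in the window exceeds max_gap.

    spans: list of (start, end) coverage dates, both inclusive.
    """
    # Last covered day seen so far; start just before the window opens.
    covered_until = window_start - timedelta(days=1)
    for start, end in sorted(spans):
        if end < window_start or start > window_end:
            continue  # span lies entirely outside the window
        gap = (start - covered_until).days - 1  # uncovered days before this span
        if gap > max_gap:
            return False
        covered_until = max(covered_until, end)
        if covered_until >= window_end:
            return True
    # Uncovered days remaining at the end of the window.
    return (window_end - covered_until).days <= max_gap
```

The same function can be applied separately to the pre-index and post-index windows, once for medical and once for pharmacy coverage.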

Algorithm Development

The algorithm described in Ingram et al. was developed so that it was applicable to both UK healthcare databases and the most stringent consensus disease definition for HS in the UK [18]. Due to differences in treatment patterns for HS between the UK and the US, Algorithm 1 was adapted to be more applicable to the US, and Algorithm 2 was developed to provide a more US-specific algorithm (Table 1). A prominent feature of Algorithm 2 was that HS symptoms were limited to boils, furuncles, etc., in specific site areas, as patients may have abscesses or inflamed cysts that are unrelated to HS. Algorithm 2 built on the learnings from Algorithm 1, omitting categories with lower conversion rates from potential HS cases to validated cases, in the context that it was not possible to perform the validation step for the US insurance data sources. Hence, Algorithm 2 is deliberately more restrictive in the sub-algorithm items included, aiming to provide a robust, conservative estimate of HS prevalence.

The iterative modification of Algorithm 1 and the development of Algorithm 2 are detailed in Table S1 (Supplementary Material), which includes the changes made and the rationale for the associated changes or adaptations.

Data and Statistical Analyses

Overall, no direct comparisons were conducted as part of this study, and all analyses were descriptive only. Missing values were considered a separate category within each variable. All data were anonymized prior to the transfer to Novartis’ in-house data42 platform. All analyses were performed by Novartis real-world evidence data analysts on the data42 platform. For the primary objective, the sum of the unique patient counts was obtained by applying the different sub-algorithms based on the diagnosis, procedures, and treatments to retrieve the number of patients with possible/diagnosed HS. Patients included in one sub-algorithm were excluded while considering subsequent sub-algorithms to avoid duplication of records. Due to the nature of HS diagnosis, it is more likely that lesions were documented in visits prior to the one that led to a HS diagnosis rather than afterward. As such, all observations from the identification period were considered when using the algorithms to identify possible HS in patients who were diagnosed with HS. For period prevalence, the numerator was all new and pre-existing patients diagnosed with HS or patients with possible/diagnosed HS during the time period, and the denominator was the insured population in each respective database during the time period.
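The hierarchical counting and the period prevalence calculation described above can be illustrated with a short Python sketch. The function names and the toy patient identifiers are hypothetical; the sketch only shows the principle that each patient is counted under the first sub-algorithm that captures them and that prevalence is cases divided by the insured population.

```python
def assign_hierarchically(sub_algorithm_hits):
    """Count each patient once, under the first sub-algorithm that captures them.

    sub_algorithm_hits: ordered list of (name, set_of_patient_ids).
    Returns (per-sub-algorithm counts, set of all unique patients).
    """
    seen = set()
    counts = {}
    for name, patients in sub_algorithm_hits:
        new_patients = set(patients) - seen  # drop patients already captured
        counts[name] = len(new_patients)
        seen |= new_patients
    return counts, seen


def period_prevalence_pct(cases, insured_population):
    """Period prevalence as a percentage of the insured population."""
    if insured_population <= 0:
        raise ValueError("insured population must be positive")
    return 100.0 * cases / insured_population
```

For example, with hits `[("1a", {1, 2, 3}), ("1b", {3, 4}), ("2a", {1, 4, 5})]`, patient 3 is counted only under 1a and patient 4 only under 1b, so the per-sub-algorithm counts sum to the number of unique patients.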

Sensitivity Analyses

Sensitivity Analysis 1

In Ingram et al. [18], potential HS cases were validated by circulating questionnaires to primary care physicians for a subset of patients, and a resultant conversion factor was produced to calculate extrapolated cases of HS which were criteria-diagnosed cases. The same conversion factors were applied in this analysis to Algorithm 1 as a sensitivity analysis to address the likely overestimation of HS prevalence identified with this algorithm.

Sensitivity Analysis 2

A second sensitivity analysis was performed to calculate the incidence of patients with diagnosed HS as well as patients with possible/diagnosed HS in 2020 to understand the impact of the coronavirus 2019 (COVID-19) pandemic on health care-seeking behavior versus previous years. This sensitivity analysis was conducted using the Optum EHR and Optum CDM databases only because Novartis no longer had access to MarketScan databases when this sensitivity analysis was conducted.

Sensitivity Analysis 3

A third sensitivity analysis was conducted to assess HS prevalence based on the selection of patients with ≥ 1 outpatient or ≥ 1 inpatient diagnosis code of HS during the identification period. This identification method is used in some cohort studies and has been previously validated [19]. This sensitivity analysis was conducted using the Optum EHR and Optum CDM databases only because Novartis no longer had access to MarketScan databases when this sensitivity analysis was conducted.

Results

Primary Objective: Number of Patients with Possible/Diagnosed HS

The number of patients with possible/diagnosed HS identified with Algorithm 1 and Algorithm 2 from all databases is shown in Table 2. Across all databases, Algorithm 2 (MarketScan Medicare [N = 309,916]; MarketScan Medicaid [N = 188,783]; Optum EHR [N = 366,158]; Optum CDM [N = 173,812]) identified more patients with possible/diagnosed HS versus Algorithm 1 (MarketScan Medicare [N = 194,353]; MarketScan Medicaid [N = 99,276]; Optum EHR [N = 177,957]; Optum CDM [N = 112,244]).

Table 2 Number and proportion of patients with possible/diagnosed HS identified using Algorithm 1 and Algorithm 2 by database

In Algorithm 1, almost all patients were identified based on diagnosis codes for boils, furuncles, carbuncles, and/or abscesses alone (i.e., sub-algorithms 1a–c and 2a–b) (MarketScan Medicare [96.8%, 188,079/194,353]; MarketScan Medicaid [96.3%, 95,570/99,276]; Optum EHR [97.5%, 173,503/177,957]; Optum CDM [97.6%, 109,577/112,244]). In Algorithm 2, most patients were identified based on sub-algorithm 1a, defined as ≥ 2 diagnoses of boils, furuncle, carbuncle, or abscess of the axilla, groin, perineum, or buttocks within a 6-month period (MarketScan Medicare [87.8%, 272,023/309,916]; MarketScan Medicaid [90.9%, 171,640/188,783]; Optum EHR [97.2%, 355,738/366,158]; Optum CDM [89.7%, 155,969/173,812]).
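The "≥ 2 diagnoses within a 6-month period" rule of Algorithm 2, sub-algorithm 1a, could be checked along the following lines. This is a sketch under stated assumptions: the helper name is hypothetical, 6 months is approximated as 183 days, and diagnoses recorded on the same date are treated as a single diagnosis, none of which is specified in the text.

```python
from datetime import date


def has_repeat_within(dates, min_count=2, window_days=183):
    """True if at least `min_count` distinct diagnosis dates fall within
    `window_days` of each other (sliding window over sorted dates)."""
    ordered = sorted(set(dates))  # assumption: same-day codes count once
    for i in range(len(ordered) - min_count + 1):
        span = (ordered[i + min_count - 1] - ordered[i]).days
        if span <= window_days:
            return True
    return False
```

The input would be the dates of site-specific boil, furuncle, carbuncle, or abscess codes for one patient during the identification period.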

Secondary Objectives

Attrition of Patients with Diagnosed HS Based on ICD Codes

The attrition of patients with diagnosed HS based on ICD-9/10 codes from each database is detailed in Fig. 1. For all databases, there was an expected difference between the number of patients included based on patients with ≥ 2 outpatient or ≥ 1 inpatient diagnoses (MarketScan Medicare [N = 48,582]; MarketScan Medicaid [N = 26,384]; Optum EHR [N = 40,953]; Optum CDM [N = 24,012]) versus patients with continuous medical and pharmacy benefits in the 0–365 days post-index period (MarketScan Medicare [N = 23,463]; MarketScan Medicaid [N = 14,720]; Optum EHR [N = 31,793]; Optum CDM [N = 9902]).

Fig. 1

Attrition of patients with diagnosed HS included in the analysis from all databases based on stepwise inclusion criteria. Flow diagram detailing the attrition of patients with diagnosed HS based on ICD-9/10 codes included in the A MarketScan (Medicare), B MarketScan (Medicaid), C Optum EHR, and D Optum CDM databases. CDM clinformatics data mart, EHR electronic health record, HS hidradenitis suppurativa, ICD-9/10 International Classification of Diseases version 9/10, N number of patients

Patients with Possible/Diagnosed HS and HS Diagnosis Codes

The proportion of patients identified/not identified as having HS using HS diagnosis codes and algorithms is detailed in Table S2 (Supplementary Material).

In Algorithm 1, the proportion of patients with possible/diagnosed HS among patients with actual HS diagnosis codes was 20.4% for MarketScan (Medicare), 27.4% for MarketScan (Medicaid), 25.0% for Optum EHR, and 19.5% for Optum CDM. In Algorithm 2, the proportion of patients with possible/diagnosed HS with an actual HS diagnosis code was 19.5% for MarketScan (Medicare), 31.0% for MarketScan (Medicaid), 26.1% for Optum EHR, and 20.7% for Optum CDM.

In Algorithm 1, the proportion of patients with HS diagnosis codes among patients identified using the algorithm was 5.0% for MarketScan (Medicare), 7.6% for MarketScan (Medicaid), 5.8% for Optum EHR, and 3.4% for Optum CDM. In Algorithm 2, the proportion of patients with HS diagnosis codes among patients identified using the algorithm was 3.0% for MarketScan (Medicare), 4.6% for MarketScan (Medicaid), 3.3% for Optum EHR, and 2.3% for Optum CDM.
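The two proportions reported above share a numerator (patients captured by both the HS diagnosis codes and an algorithm) but use different denominators. A minimal Python sketch with hypothetical patient identifiers makes the distinction explicit; the function name is an assumption for illustration.

```python
def overlap_proportions(diagnosed_ids, algorithm_ids):
    """Return (pct_of_diagnosed, pct_of_algorithm).

    pct_of_diagnosed: share of code-diagnosed patients also found by the algorithm.
    pct_of_algorithm: share of algorithm-identified patients who have an HS code.
    """
    both = diagnosed_ids & algorithm_ids
    pct_of_diagnosed = 100.0 * len(both) / len(diagnosed_ids)
    pct_of_algorithm = 100.0 * len(both) / len(algorithm_ids)
    return pct_of_diagnosed, pct_of_algorithm
```

For example, with 5 diagnosed patients, 20 algorithm-identified patients, and 1 patient in both groups, the first proportion is 20% and the second is 5%, which is the same pattern of a larger first and smaller second percentage seen in the reported results.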

Period Prevalence

The 5-year period prevalence estimates for patients with diagnosed HS using ICD codes were 0.08% for MarketScan (Medicare), 0.12% for MarketScan (Medicaid), 0.06% for Optum EHR, and 0.06% for Optum CDM (Table 3). Adult patients with HS had the highest 5-year period prevalence estimates across all databases (MarketScan Medicare [0.09%]; MarketScan Medicaid [0.20%]; Optum EHR [0.07%]; Optum CDM [0.06%]), and females had higher 5-year period prevalence estimates versus males across all databases (MarketScan Medicare [0.11% vs. 0.04%]; MarketScan Medicaid [0.17% vs. 0.05%]; Optum EHR [0.09% vs. 0.03%]; Optum CDM [0.08% vs. 0.03%]).

Table 3 Five-year period prevalence of patients with diagnosed HS in the US population using ICD codes by database overall and based on demographic characteristics

The 5-year period prevalence estimates for patients with possible/diagnosed HS identified using Algorithm 1 were 0.31% for MarketScan (Medicare), 0.41% for MarketScan (Medicaid), 0.27% for Optum EHR, and 0.32% for Optum CDM (Table 4). In Algorithm 1, the 5-year period prevalence estimates were highest in adults across all databases (MarketScan Medicare [0.36%]; MarketScan Medicaid [0.65%]; Optum EHR [0.30%]; Optum CDM [0.36%]), and females had higher 5-year period prevalence estimates than males across all databases (MarketScan Medicare [0.34% vs. 0.29%]; MarketScan Medicaid [0.49% vs. 0.32%]; Optum EHR [0.29% vs. 0.24%]; Optum CDM [0.33% vs. 0.30%]). The 5-year period prevalence estimates for patients with possible/diagnosed HS identified using Algorithm 2 were 0.50% for MarketScan (Medicare), 0.78% for MarketScan (Medicaid), 0.49% for Optum EHR, and 0.49% for Optum CDM (Table 4). In Algorithm 2, the 5-year period prevalence estimates were highest in adults across all databases (MarketScan Medicare [0.59%]; MarketScan Medicaid [1.23%]; Optum EHR [0.56%]; Optum CDM [0.57%]), and females had higher 5-year period prevalence estimates than males across all databases (MarketScan Medicare [0.54% vs. 0.45%]; MarketScan Medicaid [0.92% vs. 0.59%]; Optum EHR [0.53% vs. 0.45%]; Optum CDM [0.52% vs. 0.47%]), which was more pronounced than in Algorithm 1.

Table 4 Five-year period prevalence of patients with possible/diagnosed HS by algorithm and database overall and based on demographic characteristics

Furthermore, the 5-year period prevalence of patients with an actual HS diagnosis plus patients identified from the algorithms minus those identified with the algorithms as having an actual diagnosis code in the Optum EHR and Optum CDM databases is detailed in Table 5. The 5-year period prevalence estimates using Algorithm 1 were 0.31% for Optum EHR and 0.36% for Optum CDM. Similarly, the 5-year period prevalence estimates were highest in adults in both databases (Optum EHR [0.36%]; Optum CDM [0.41%]), and females had higher 5-year period prevalence estimates than males in both databases (Optum EHR [0.36% vs. 0.26%]; Optum CDM [0.40% vs. 0.33%]). The 5-year period prevalence estimates using Algorithm 2 were 0.54% for Optum EHR and 0.54% for Optum CDM. Similarly, the 5-year period prevalence estimates were highest in adults in both databases (Optum EHR [0.62%]; Optum CDM [0.62%]), and females had higher 5-year period prevalence estimates than males in both databases (Optum EHR [0.59% vs. 0.47%]; Optum CDM [0.58% vs. 0.49%]).
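The combined case count used for this estimate (diagnosed patients plus algorithm-identified patients minus those counted in both groups) is the size of the union of the two groups. The Python sketch below shows the arithmetic; the function name and the toy numbers in the usage note are illustrative and are not taken from the study data.

```python
def combined_case_count(n_diagnosed, n_algorithm, n_both):
    """Inclusion-exclusion: unique patients in either group."""
    if n_both > min(n_diagnosed, n_algorithm):
        raise ValueError("overlap cannot exceed either group size")
    return n_diagnosed + n_algorithm - n_both
```

With 10 diagnosed patients, 50 algorithm-identified patients, and 4 in both groups, the combined count is 56; dividing by the insured population then gives the combined period prevalence.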

Table 5 Five-year period prevalence of patients with an actual HS diagnosis plus patients identified from the algorithms minus those identified with the algorithms as having an actual diagnosis code

Sensitivity Analyses

Sensitivity Analysis 1

In Ingram et al. [18], prior to validation, cases were referred to as “proxy cases.” Following the application of the conversion factor, cases were referred to as “criteria-diagnosed cases” (validated with ≥ 5 flexural boils) and “additional probable cases” (validated with 1–4 flexural boils). Applying this principle to the current study, total proxy cases were 194,353 for MarketScan (Medicare), 99,422 for MarketScan (Medicaid), 177,957 for Optum EHR, and 112,244 for Optum CDM. Criteria-diagnosed cases were 77,037 for MarketScan (Medicare), 47,339 for MarketScan (Medicaid), 82,058 for Optum EHR, and 48,334 for Optum CDM. Additional probable cases were 41,947 for MarketScan (Medicare), 16,466 for MarketScan (Medicaid), 32,372 for Optum EHR, and 22,781 for Optum CDM. The 5-year period prevalence based on extrapolated cases was 0.23% for MarketScan (Medicare), 0.58% for MarketScan (Medicaid), 0.16% for Optum EHR, and 0.27% for Optum CDM (Figure S1; Supplementary Material).

Sensitivity Analysis 2

The incidence of HS in patients with a confirmed diagnosis in 2020 was 0.004% in the Optum EHR database and 0.007% in the Optum CDM database (Figure S2; Supplementary Material); these incidences were lower than in 2014–2018 (Optum EHR incidence range: 0.010%‒0.013%; Optum CDM incidence range: 0.012%‒0.013%). Similar results were observed using both Algorithm 1 and Algorithm 2 (Figure S2; Supplementary Material). In Algorithm 1, the incidence of patients with possible/diagnosed HS in 2020 was 0.014% in the Optum EHR database and 0.039% in the Optum CDM database, which was lower than those in 2014–2018 (Optum EHR incidence range: 0.038%‒0.078%; Optum CDM incidence range: 0.054%‒0.108%). In Algorithm 2, the incidence of patients with possible/diagnosed HS in 2020 was 0.023% in the Optum EHR database and 0.052% in the Optum CDM database, which was lower than those in 2014–2018 (Optum EHR incidence range: 0.081%‒0.150%; Optum CDM incidence range: 0.084%‒0.201%).

Sensitivity Analysis 3

A total of 86,700 patients in the Optum EHR database and 58,132 patients in the Optum CDM database had ≥ 1 inpatient or ≥ 1 outpatient diagnosis of HS during the identification period. The 5-year period prevalence estimate of patients with diagnosed HS who had ≥ 1 inpatient or ≥ 1 outpatient diagnosis of HS was 0.13% in both the Optum EHR and Optum CDM databases.

Discussion

The global and regional prevalence of HS is unclear in the published literature to date. Studies report widely varying prevalence; thus, the reported prevalence of HS may be underestimated [13, 20,21,22]. It is well recognized that there is a substantial diagnostic delay in HS, which is reported to be approximately 7–10 years [3, 5, 6, 16]. Patients with HS are reported to visit ≥ 3 different physicians and undergo ≥ 3 misdiagnoses before receiving a correct diagnosis of HS [6, 16, 17], owing to under-recognition and lack of awareness of HS among different physician specialties [3, 23, 24]. The importance of a timely diagnosis of HS is directly linked to the implementation of timely treatment interventions, the absence of which can have a devastating impact on the disease course of HS [7, 25, 26]. Ingram et al. previously developed and validated an algorithm from the CPRD-HES in the UK to identify potential undiagnosed cases of HS and reported that up to one-third of patients were undiagnosed [18]. The current analysis assessed the applicability of using two algorithms, a modification of the algorithm described in Ingram et al. [18] and a new algorithm adapted for the US, to identify potentially undiagnosed patients with HS from four US databases.

In the current analysis, across all databases, both algorithms identified a higher 5-year period prevalence of patients with HS (Algorithm 1: 0.27%–0.41%; Algorithm 2: 0.49%–0.78%) versus patients identified using ICD codes alone (0.06%–0.12%), highlighting that many patients in the US in these databases are likely to remain undiagnosed. The estimates based on ICD codes (0.06%‒0.12%) are close to the figure of 0.10% that has been reported in previous epidemiological studies in the US [21]. The estimates based on Algorithm 1 are 3.4‒5.3 times higher than those based on ICD codes, while the estimates based on Algorithm 2 are 6.3‒8.2 times higher than those based on ICD codes. This finding suggests that US insurance data sources may underestimate the true prevalence of HS because of patients being undiagnosed/un-coded to a larger extent than the UK primary care data source where algorithm estimates (1.19%) were 1.5 times higher than coding alone (0.77%) [18]. As such, the true prevalence of HS in the US and UK, including diagnosed and undiagnosed patients, may be relatively similar.
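The fold differences quoted above can be reproduced from the rounded 5-year period prevalence values reported in the Results. The Python sketch below does this per database; because it starts from rounded percentages, the ratios may differ slightly from values computed on unrounded data, and the function name is an assumption for illustration.

```python
# Reported 5-year period prevalence (%) by database, taken from the Results.
ICD = {"MarketScan Medicare": 0.08, "MarketScan Medicaid": 0.12,
       "Optum EHR": 0.06, "Optum CDM": 0.06}
ALG1 = {"MarketScan Medicare": 0.31, "MarketScan Medicaid": 0.41,
        "Optum EHR": 0.27, "Optum CDM": 0.32}
ALG2 = {"MarketScan Medicare": 0.50, "MarketScan Medicaid": 0.78,
        "Optum EHR": 0.49, "Optum CDM": 0.49}


def fold_range(algorithm, icd):
    """Smallest and largest algorithm-to-ICD prevalence ratio across databases."""
    ratios = [algorithm[db] / icd[db] for db in icd]
    return min(ratios), max(ratios)


alg1_low, alg1_high = fold_range(ALG1, ICD)
alg2_low, alg2_high = fold_range(ALG2, ICD)
```

The smallest Algorithm 1 ratio comes from MarketScan Medicaid (0.41/0.12) and the largest from Optum CDM (0.32/0.06); for Algorithm 2, the smallest comes from MarketScan Medicare (0.50/0.08) and the largest from the Optum databases (0.49/0.06).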

As expected, adult patients generally had the highest 5-year prevalence estimates across all databases versus adolescent and pediatric patients for both diagnosed cases of HS and possible/diagnosed cases of HS based on the algorithms. Similarly, female patients had a higher prevalence versus males for all databases. Among diagnosed patients with HS, the prevalence in females was 2.7‒3.4 times higher than that in males, whereas among patients with possible/diagnosed HS based on the algorithms, it was 1.1‒1.6 times higher. The higher prevalence in adults and females is consistent with previous US and European population studies [20, 21, 27,28,29,30,31]. Furthermore, Black patients generally had the highest 5-year prevalence estimates of all identified races, which is similar to other US studies [20, 21] and reinforces that, in the US, HS disproportionately affects African American patients [32,33,34].

Although the use of the algorithms applied in this analysis may help capture patients with HS who are undiagnosed, it is expected that the algorithms likely overestimate the true prevalence of HS and that the true prevalence of HS is likely somewhere between the prevalence estimated from the algorithms and the prevalence revealed based on the HS ICD codes. This is reinforced by the first sensitivity analysis that used a conversion factor to better estimate the prevalence of patients with possible/diagnosed HS, which ranged from 0.16% to 0.58% across all databases in Algorithm 1 (versus 0.27%‒0.41% without the conversion factors). However, caution should be advised when applying these conversion factors as they were created based on a UK population and may not be directly applicable to a US population [18].

Interestingly, there was generally a decrease in incidence in patients with diagnosed HS as well as patients with possible/diagnosed HS between 2014 and 2018, and the incidence in 2020 was lower than in previous years. This finding is not consistent with the literature, which suggests an increasing prevalence of HS, possibly due to an increased awareness of the disease [27, 28, 31]. External factors influencing health care-seeking behaviors, such as changes in insurance premiums, may have impacted this finding. The advent of the COVID-19 pandemic in 2020 likely contributed to the substantial decrease in observed incidence, as patients have been reported to have limited health care-seeking behaviors during this time [35]. As noted above, delaying a diagnosis of HS and subsequent treatment can have an impact on the disease course [7, 25, 26], and health-care practitioners should be vigilant regarding this finding to ensure that the momentum in increased awareness and diagnosis of HS is maintained to improve treatment outcomes in this population.

Limitations

The algorithms used in this analysis have not been validated, and as such, the results should be interpreted with caution. While the patient population is reasonably representative of the US population, considering the limitations of the secondary data available in the different datasets and potential limitations of the algorithms, it is difficult to synthesize a single value for 5- or 1-year prevalence. An inherent limitation of these databases is that the recorded data may be inaccurate or subject to human or technical error and may also include missing, invalid, not recorded, or unknown data. Furthermore, duplicate records for patients within the databases may have affected the present analysis. Some variables are obtained from the physicians’ clinical impression of the patient and not from objective measures, which may lead to biases. Differences between the datasets, such as in the definition of continuous enrollment or continuous medical and pharmacy benefits, may have led to differences in results across databases. While the other databases have dedicated tables for recording continuous enrollment, the Optum EHR database does not. To overcome this issue, continuous enrollment in the Optum EHR database was defined as the presence of any activity during the pre- and post-index periods.

Conclusion

This real-world, US-based analysis provides valuable insights into the use of two algorithms to identify patients with possible/diagnosed HS from four different health care databases. The employed algorithms identified a higher prevalence of patients with HS than HS ICD diagnosis codes did, highlighting that the use of diagnosis codes alone may underestimate the prevalence of HS by varying margins and showcasing the usefulness of algorithms that assess parameters such as multiple skin boils in site-specific areas to identify undiagnosed patients. Nevertheless, the true prevalence of HS likely lies somewhere between the prevalence estimated from the algorithms and the prevalence based on the HS ICD codes. As reported in previous studies conducted in the US, adults and female patients had a higher prevalence of HS versus adolescent and pediatric patients and male patients, respectively. Implementing such algorithms in clinical practice may improve HS identification and diagnosis, thereby mitigating the burdensome diagnostic delay associated with the disease and improving treatment outcomes.