Introduction

Trichotillomania (TTM, hair-pulling disorder) is a chronic psychiatric condition. Its diagnostic criteria (American Psychiatric Association, 2013) are recurrent pulling out of hair resulting in noticeable hair loss, severe distress and impairment in day-to-day functioning, and repeated attempts to decrease or stop hair-pulling; in addition, the hair-pulling cannot be attributed to or better explained by another mental or medical condition. Trichotillomania is currently classified within the broader category of Obsessive–Compulsive and Related Disorders (APA, 2013) due to the biological (Monzani et al., 2014) and phenomenological (Lochner et al., 2005) similarities between TTM and obsessive–compulsive disorder (OCD).

Trichotillomania usually starts in adolescence and follows a chronic course of waxing and waning symptom severity (Bloch, 2009). Recent meta-analyses suggested that the prevalence of TTM is 1.4% in the general population; subclinical hair-pulling is also common, with prevalence rates as high as 8.84% (Thomson et al., 2022). It is plausible, however, that the actual prevalence of TTM in the community is even higher, since many sufferers deny pulling behaviors due to shame and low social acceptance of the disorder. Given that hair-pulling contributes to wide-ranging and long-lasting impairments in not only physical health but also several psychological domains, such as self-esteem, body image, emotions and social functioning (Diefenbach et al., 2005), the accurate assessment of TTM, including its severity and consequences, is crucial.

Investigating the prevalence of TTM as well as its severity and impact on daily functioning requires psychometrically validated tools. Currently, the assessment of TTM relies primarily on clinical interview and a limited range of paper-and-pencil measurements (Diefenbach et al., 2005), including clinician and patient rating scales. Self-reports are widely used due to the numerous benefits they provide: ease of administration; ability to capture a wide range of hair-pulling characteristics (symptom severity, interference and distress); control over the time and duration of assessment (Diefenbach et al., 2005); and the possibility of repeated administration, even in large groups, which facilitates TTM research. Of these, the Massachusetts General Hospital Hairpulling Scale (MGH-HPS; Keuthen et al., 1995) is the most commonly used self-report rating scale to measure hair-pulling in adults, in both everyday clinical practice and scientific research.

The MGH-HPS includes the relevant items from the Yale-Brown Obsessive–Compulsive Scale (Y-BOCS), modified to refer to trichotillomania symptoms (Keuthen et al., 1995). In particular, the Y-BOCS items that refer to obsessions were replaced with items concerning hair-pulling urges (the frequency of pulling urges; the intensity of pulling urges; the perceived ability to control hair-pulling urges). Also, the subsequent Y-BOCS items concerning compulsions were modified to rate hair-pulling frequency, attempts to resist hair-pulling, and actual control over this behavior. Additionally, to improve the internal consistency of the scale, the item related to the social impact of hair-pulling was dropped due to its poor correlation with the total MGH-HPS score and limited response variability (Keuthen et al., 1995). The final version of the MGH-HPS contains seven items assessing the frequency, intensity, and control of hair-pulling urges; the frequency, resistance and control of hair-pulling behaviors; and the distress associated with hair-pulling.

The final 7-item version of the MGH-HPS was found to have very good internal consistency (with coefficient alpha of 0.89) in a sample of chronic hair-pulling patients (Keuthen et al., 1995). Moreover, the data from an independent sample yielded acceptable test–retest reliability for use in clinical assessment, and acceptable convergent and divergent validity (O’Sullivan et al., 1995). The MGH-HPS was also confirmed to be a useful tool for monitoring the course of this disorder due to its sensitivity to changes in symptoms (O’Sullivan et al., 1995).

The one-factor structure of the MGH-HPS was obtained in the original validation study that was conducted on a group of 119 hair-pullers during the development of the scale (Keuthen et al., 1995), suggesting that it is a homogeneous tool that measures the severity of TTM symptoms in a clinical sample. However, in a subsequent internet-based survey (Keuthen et al., 2007), two factors, namely “Severity” and “Resistance and Control”, were identified in a large group (n = 990) of self-reported hair-pullers who confirmed meeting all DSM-IV diagnostic criteria. In the same study, this two-factor solution was confirmed on a larger (n = 1697) mixed sample of individuals who satisfied either strict or more liberal DSM-IV criteria (Keuthen et al., 2007). More recent validations of the Persian and Turkish versions of the MGH-HPS (Aydin et al., 2023; Rabiei et al., 2014), conducted on a large clinical sample (n = 635) of individuals attending medical and psychology clinics (Rabiei et al., 2014) and on patients (n = 50) who were being treated for trichotillomania (Aydin et al., 2023), yielded a unidimensional factor solution that corresponded with the structure that emerged from Keuthen et al.’s (1995) original study.

The aim of the current survey was to develop the Polish version of the MGH-HPS (Keuthen et al., 1995) and evaluate its psychometric properties in a community sample. Specifically, we examined the factor structure, reliability, convergent and divergent validity, and diagnostic accuracy of the MGH-HPS in a group of adults with self-reported clinical and subclinical hair-pulling.

Method

Participants

Determination of the Sample Size

Taking into consideration the large factor loadings found in previous studies investigating the structure of the MGH-HPS (Aydin et al., 2023), it was assumed that a minimum of 5 participants per estimated parameter would be required to run confirmatory factor analysis (CFA), as recommended by Bentler and Chou (1987). This criterion has been applied in numerous previous studies investigating the psychometric properties of questionnaires, conducted mainly in hard-to-reach populations (e.g., Chen & Chow, 2022; Perceau-Chambard et al., 2023; Samiefard et al., 2023). Given the 7 observed variables (the MGH-HPS items) and the 2 latent variables (the hypothesized subscales) in the CFA model (i.e., 15 parameters to be estimated), we calculated that an absolute minimum of 75 participants who declare pulling their hair was needed to conduct such an analysis. Bearing in mind the results of the only previous study that used DSM criteria to estimate the point prevalence of subclinical hair-pulling (Dubose & Spirrison, 2006; see Turk et al., 2022 for review), we expected that about 8% of the initial responders would engage in hair-pulling behaviors, meaning that about 950 participants (0.08 × 950 = 76) needed to be screened to obtain the required subsample.
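The arithmetic behind this sample-size estimate can be retraced in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the breakdown of the 15 parameters (7 loadings, 7 error variances and 1 factor covariance, with factor variances fixed to 1) is an assumption, since the text reports only the total.

```python
import math

def cfa_free_parameters(n_items: int, n_factors: int) -> int:
    """Count free parameters in a simple-structure CFA.

    Assumption (not stated in the text): factor variances are fixed to 1,
    so each item contributes one loading and one error variance, and each
    pair of latent factors contributes one covariance.
    """
    loadings = n_items
    error_variances = n_items
    factor_covariances = n_factors * (n_factors - 1) // 2
    return loadings + error_variances + factor_covariances

def minimum_sample(n_params: int, cases_per_param: int = 5) -> int:
    """Minimum N under an N:q rule (5:1 after Bentler & Chou, 1987)."""
    return n_params * cases_per_param

def required_screened(n_target: int, expected_rate: float) -> int:
    """Smallest number of screened responders expected to yield n_target."""
    return math.ceil(n_target / expected_rate)

params = cfa_free_parameters(n_items=7, n_factors=2)   # 15
n_min = minimum_sample(params)                          # 75
n_screen = required_screened(n_min, expected_rate=0.08) # 938
print(params, n_min, n_screen)
```

Under these assumptions the strict minimum is 938 screened responders, which is consistent with the rounded target of about 950 given above.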

Study Sample

Potential responders were invited to participate via posts on online forums and websites that target university samples. To reach individuals with hair-pulling behaviors, we included the term “hair-pulling (trichotillomania)” in the invitation title. After clicking the link, participants were redirected to the survey, where they gave informed consent, indicated their age and gender, and were asked to answer the screening question: “Did you pull your hair in the previous week at least occasionally?”. Participants responded by choosing “Yes” or “No”. If the respondent chose the answer “No”, the form was closed and the respondent was informed that they did not meet the study inclusion criteria. If the respondent confirmed that they had pulled their hair at least occasionally in the past week, they were redirected to the relevant survey.

The initial study sample was composed of 1024 participants (75.9% women, 23.5% men, 0.6% non-binary) aged 18–66 (M = 24.73, SD = 8.00) who had completed the online screening survey. Within this group, the responders who pulled their hair at least occasionally in the week prior to the study were selected (n = 92, 8.98% of the initial sample): 68 women (8.75% of the screened women) and 24 men (9.96% of the screened men). We also identified a subgroup of hair-pullers (n = 23, 2.25% of the initial sample) who satisfied the full DSM-5 hair-pulling disorder diagnostic criteria and were thus likely to suffer from hair-pulling disorder of clinical relevance (Fig. 1). This subsample consisted of 3 men (0.29% of the initial sample) and 20 women (1.95% of the initial sample). The sociodemographic characteristics of the sample are presented in Table 1.

Fig. 1 Recruitment flowchart

Table 1 Sociodemographic characteristics of the sample

Measurements

The Massachusetts General Hospital Hair Pulling Scale (MGH-HPS; Keuthen et al., 1995; O’Sullivan et al., 1995) The MGH-HPS is a commonly used, self-administered scale developed to measure hair-pulling behaviors in adults. It consists of 7 items that assess (1) the frequency of hair-pulling urges; (2) the intensity of urges; (3) ability to control the urges; (4) frequency of hair-pulling; (5) attempts to resist hair-pulling; (6) control over hair-pulling; (7) distress associated with hair-pulling. Each item is rated on a 5-point scale from 0 (no symptoms) to 4 (severe symptoms), resulting in a total score of 0–28. Each point on the scale is accompanied by a short description reflecting the severity of a particular behavior/experience during the prior week. A higher total score indicates greater TTM severity.
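The scoring rule described above (seven items, each rated 0–4, summed to a total of 0–28) can be sketched as follows. The function name and the example responses are hypothetical and serve only to illustrate the arithmetic; they are not part of the published instrument.

```python
def mgh_hps_total(item_scores):
    """Sum seven MGH-HPS item ratings (each 0-4) into a total of 0-28.

    Raises ValueError when the response pattern is incomplete or out of range,
    so that invalid records are not silently scored.
    """
    if len(item_scores) != 7:
        raise ValueError("The MGH-HPS has exactly 7 items")
    for score in item_scores:
        if not isinstance(score, int) or not 0 <= score <= 4:
            raise ValueError("Each item must be an integer from 0 to 4")
    return sum(item_scores)

# Hypothetical response pattern, for illustration only.
example = [2, 3, 1, 2, 0, 1, 4]
print(mgh_hps_total(example))
```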

In the current study, we used the Polish translation of the MGH-HPS, prepared according to the standard procedure of forward and back translation. First, two independent Polish versions of the scale were produced by two professional translators with experience in psychological texts; these versions were then merged into the first consensus version, which was subsequently translated back into English. Then, the forward and back translations were compared to the original version to ensure that semantic equivalence between items had been achieved. All inconsistencies were discussed and resolved in a consensus meeting. The final Polish version was obtained after minor linguistic corrections. Finally, the online version of the MGH-HPS, which has the same content as the paper-and-pencil form, was prepared to conduct the current survey.

Depression Anxiety Stress Scales 21-Item Version (DASS-21; Lovibond & Lovibond, 1995) The DASS-21 is a set of three self-administered scales designed to measure depression (dysphoria, hopelessness, devaluation of life, self-deprecation, lack of interest/involvement, anhedonia, inertia), anxiety (autonomic arousal, skeletal muscle effects, situational anxiety, and subjective experience of anxious affect) and stress (difficulty relaxing, nervous arousal, and being easily upset/agitated, irritable/over-reactive, impatient) experienced in the week prior to the study. Each of the three scales contains 7 items, rated on a 4-point scale from 0 (strongly disagree) to 3 (totally agree). The final score of each subscale is obtained by summing the scores of the items and, following the standard DASS-21 scoring convention, doubling the sum, so that it ranges from 0 to 42. The higher the score, the more severe the emotional distress. In the current study, we used the Polish version of the DASS-21. The Cronbach’s alpha coefficients calculated for the study sample were as follows: 0.90 for the depression subscale, 0.84 for the anxiety subscale, and 0.85 for the stress subscale.

Obsessive Compulsive Inventory-Revised (OCI-R; Foa et al., 2002; Mojsa-Kaja et al., 2016) The OCI-R (short version) is an 18-item self-report scale that assesses the frequency and degree of distress associated with a broad range of obsessive and compulsive symptoms experienced during the past month: (1) washing; (2) checking/doubting; (3) obsessing; (4) mental neutralizing; (5) ordering; (6) hoarding. Each item is rated on a 5-point scale ranging from 0 (not at all) to 4 (extremely), resulting in a total score of 0–72 and subscale scores from 0 to 12. The Polish translation of the OCI-R was administered in this study. The internal consistency of the individual subscales calculated for the study sample ranged from 0.49 (for the neutralizing subscale) to 0.89 (for the obsessing subscale) and was 0.89 for the total score.

Hair-Pulling Disorder Diagnostic Criteria In the present study, the participants were provided with a list of questions regarding the DSM-5 hair-pulling disorder diagnostic criteria (APA, 2013). The participants were asked whether they (1) had recurrently pulled out their hair, resulting in hair loss; (2) had made repeated attempts to decrease or stop hair-pulling; (3) had experienced significant distress or impairment in social, occupational or other important areas of functioning due to hair-pulling; (4) had been diagnosed with a medical condition or mental disorder which may underlie their hair-pulling. For most questions, the participants responded by choosing a yes/no answer; however, for the question concerning the presence of an illness and psychiatric conditions, they were provided with four answers (yes/no/probably yes/probably not).

Sociodemographic and Illness History Data Form In the current study, participants were provided with a sociodemographic data form which included questions concerning their gender, age in years, race, marital status, educational status, employment, monthly income (in PLN), and place of residence. Also, data regarding participants’ history of past and current psychiatric conditions were collected.

Procedure

The data were collected through an online survey (Microsoft Forms). The invitation to participate in the study was posted on forums and websites that target university samples. Individuals who qualified for the study (i.e., those who declared that they had pulled their hair at least occasionally in the previous week) were directed to the online survey containing the MGH-HPS, the DASS-21 and the OCI-R scales, as well as the questions concerning the DSM-5 hair-pulling criteria and the sociodemographic and illness history data sheet. A short description of the aim of the study (validation of the Polish version of the hair-pulling scale) was included in the first part of the survey form, together with the information that the survey is anonymous and participation is voluntary. Prior to completing the study, all participants were asked to consent to participation by selecting the appropriate box on the study form. No financial remuneration was offered to participants. To recruit more individuals with hair-pulling problems, we included the term “hair-pulling” in the title of the invitation to participate. The study was approved by the Ethics Committee of the Institute of Psychology, Jagiellonian University (KE_2/2021).

Statistical Analyses

Data were analyzed using IBM AMOS (version 8) and IBM SPSS (version 28) statistical software. The internal consistency of each subscale was evaluated using Cronbach’s alpha and corrected item-total correlations. Alpha coefficients were interpreted according to the following criteria: < 0.60 = insufficient; 0.60–0.69 = marginal; 0.70–0.79 = acceptable; 0.80–0.89 = good; and 0.90 or higher = excellent (Barker et al., 1994). To verify the factor structure of the scale, a confirmatory factor analysis (CFA) was performed using the maximum likelihood (ML) estimation method. Model fit was evaluated using the following criteria: non-significant chi-square, chi-square/df ratio ≤ 2, NFI and CFI ≥ 0.95, and RMSEA value of 0.06 or lower (Cole, 1987; Hu & Bentler, 1999). The comparison of competing models was based on the scaled difference chi-square test (Satorra & Bentler, 2010). Criterion validity was assessed using Spearman’s rank-order correlations between the total score of the MGH-HPS and the score representing the number of DSM-5 criteria that were selected by the participants. Divergent validity was evaluated using Spearman’s correlations between the MGH-HPS, the DASS-21 scale and the OCI-R scale. The Mann–Whitney test was performed to explore the difference between genders in the MGH-HPS score (and subscores). Data were missing for fewer than 10% of subjects; missing data were handled with listwise deletion. All statistical analyses were two-tailed with α = 0.05.
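For readers who wish to reproduce the internal-consistency step outside SPSS, the following minimal sketch computes Cronbach’s alpha from complete cases and maps it onto the interpretation bands of Barker et al. (1994) quoted above. It assumes the standard variance-based formula for alpha; the function names and the small data matrix are hypothetical and are not taken from the study data.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for complete cases.

    rows: one list of item scores per respondent (listwise-deleted data).
    Uses alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
    """
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def interpret_alpha(alpha):
    """Interpretation bands as listed in the text (Barker et al., 1994)."""
    if alpha < 0.60:
        return "insufficient"
    if alpha < 0.70:
        return "marginal"
    if alpha < 0.80:
        return "acceptable"
    if alpha < 0.90:
        return "good"
    return "excellent"

# Hypothetical ratings (5 respondents x 3 items), for illustration only.
ratings = [
    [0, 1, 1],
    [2, 2, 3],
    [4, 3, 4],
    [1, 1, 0],
    [3, 4, 3],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2), interpret_alpha(alpha))
```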

Results

Confirmatory Factor Analysis

Prior to CFA, the multivariate normality kurtosis coefficient was evaluated using a critical ratio of 5 as a threshold. Data screening indicated non-normal data; therefore, we decided to apply Bollen–Stine bootstrapping as a robust solution for these violations (Nevitt & Hancock, 2001; Yuan & Zhong, 2013). In order to conclude that a given model is supported, the Bollen–Stine p value should not be significant (Blunch, 2013, pp. 241–243).

First, we tested the two-factor structure proposed by Keuthen et al. (2007), where the “Severity” of hair-pulling factor consisted of four items (items 1, 2, 4 and 7), and the “Resistance and Control” factor consisted of three items (items 3, 5 and 6). The two latent factors were allowed to correlate freely. The fit indices did not meet the recommended cutoff values: χ2(13) = 36.04, p = 0.002, χ2/df = 2.77, NFI = 0.91, CFI = 0.94, RMSEA = 0.14, Bollen–Stine p = 0.037. Examination of the modification indices (MI) suggested that adding a covariance between the residuals of item 1 and item 2 would improve model fit (MI = 14.82, parameter change = 0.27). Correlated residuals could be expected since these items have very similar meaning, are placed next to each other in the questionnaire, and belong to the same subscale. Therefore, the CFA was respecified by freely estimating the error covariance of these items. The modified model, including the covariance between items 1 and 2, showed better fit to the data, as confirmed by the significant χ2diff value (χ2diff (1) = 18.65, p < 0.001). The model fit statistics of the respecified model were as follows: χ2(12) = 17.39, p = 0.135, χ2/df = 1.45, NFI = 0.96, CFI = 0.99, RMSEA = 0.07, Bollen–Stine p = 0.368. The items showed significant and salient factor loadings, ranging from 0.59 to 0.98 (Fig. 2).
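Several of the reported fit statistics follow directly from the chi-square values, degrees of freedom and sample size (N = 92), and can be retraced with the sketch below. It assumes the common RMSEA formula with N − 1 in the denominator and, for the model comparison, a plain (unscaled) chi-square difference with 1 degree of freedom; it is not the scaled Satorra–Bentler procedure and does not reproduce the AMOS output. The function names are hypothetical.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def chi2_upper_tail_df1(x):
    """Upper-tail p value of a chi-square variate with 1 df.

    For 1 df the survival function has the closed form erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

N = 92
initial = {"chi2": 36.04, "df": 13}      # two-factor model as first specified
respecified = {"chi2": 17.39, "df": 12}  # with the item 1 / item 2 error covariance

for label, m in (("initial", initial), ("respecified", respecified)):
    ratio = m["chi2"] / m["df"]
    print(label, round(ratio, 2), round(rmsea(m["chi2"], m["df"], N), 2))

# Plain chi-square difference between the nested models (1 df).
delta = initial["chi2"] - respecified["chi2"]
print(round(delta, 2), chi2_upper_tail_df1(delta) < 0.001)
```

Under these assumptions the sketch returns the χ2/df ratios, the RMSEA values of 0.14 and 0.07, and the difference of 18.65 reported in the text.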

Fig. 2 Two-factor model of the MGH-HPS. Note: N = 92; MGH-HPS = the Massachusetts General Hospital Hair-pulling Scale; standardized coefficients are presented; only statistically significant paths are shown (all p < 0.001)

The two latent factors showed a very strong correlation with each other (r = 0.89, p < 0.001); therefore, in the next step we tested the one-factor solution. The chi-square difference test was non-significant (χ2diff (1) = 0.15, p = 0.703) and all fit indices still showed excellent fit to the data (χ2(13) = 17.54, p = 0.176, χ2/df = 1.35, NFI = 0.96, CFI = 0.99, RMSEA = 0.06, Bollen–Stine p = 0.409), which suggests that the two models fit the data equally well and that the more parsimonious, unidimensional model can be accepted. Factor loadings ranged from 0.60 to 0.98 (see Fig. 3 for details).

Fig. 3 One-factor model of the MGH-HPS. Note: N = 92; MGH-HPS = the Massachusetts General Hospital Hair-pulling Scale; standardized coefficients are presented; only statistically significant paths are shown (all p < 0.001)

Descriptive Statistics, Internal Consistency and Between-Gender Differences

Cronbach’s alpha was 0.89 (95% CI 0.86–0.92) for the total scale, 0.85 (95% CI 0.79–0.89) for the “Severity” subscale and 0.84 (95% CI 0.77–0.89) for the “Resistance and Control” subscale. The item-rest correlations were all above 0.40, showing good internal consistency of the total scale and subscales (Table 2). Women scored higher than men on the “Severity” subscale (z = 2.18, p = 0.029). Descriptive statistics and between-gender differences can be found in the supplementary materials.

Table 2 Item reliability statistics

Convergent and Divergent Validity

Individuals who pull their hair but do not meet the diagnostic criteria for hair-pulling disorder attained significantly lower MGH-HPS scores than those who confirmed meeting all DSM-5 criteria (MGH-HPS total score: z = 5.36, p < 0.001; MGH-HPS Severity: z = 5.73, p < 0.001; MGH-HPS Resistance and Control: z = 3.87, p < 0.001). The scores obtained by the two subgroups of hair-pullers on the MGH-HPS can be found in the supplementary materials.

Among the hair-pullers, there were strong, significant correlations between the score obtained in the scale based on DSM-5 criteria and each of the following: MGH-HPS total score (rs = 0.79, p < 0.001); MGH-HPS “Severity” subscore (rs = 0.80, p < 0.001); and MGH-HPS “Resistance and Control” subscore (rs = 0.65, p < 0.001). Divergent validity was demonstrated by either non-significant or weak (the MGH-HPS “Severity” and the DASS-21 stress: rs = 0.21, p = 0.05) correlations between the MGH-HPS scores and the DASS-21 and OCI-R scales, which respectively measure emotional distress and symptoms of obsessive–compulsive disorder. Detailed information about correlations between measures can be found in Table 3.

Table 3 Correlations between measures (Spearman’s rho)

Diagnostic Accuracy

To check the ability of the MGH-HPS to distinguish individuals with trichotillomania (N = 23) from individuals who pull their hair but do not meet the DSM-5 criteria for hair-pulling disorder (N = 69), a ROC analysis was performed (Fig. 4) and the area under the ROC curve (AUC) was assessed. The AUC was 0.87 (p < 0.001), meaning that the MGH-HPS total score differentiates between individuals who meet all the DSM-5 hair-pulling criteria and individuals who pull their hair but do not meet the DSM-5 criteria. The optimal cut-off point (13 points on the MGH-HPS) was determined using the highest Youden index, calculated as follows: (Sensitivity + Specificity) − 1. Sensitivity, specificity, and accuracy measures for the optimal cut-off point as well as other scores are presented in Table 4.
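The cut-off selection based on the Youden index can be illustrated with a short sketch. It assumes that a respondent is classified as a probable case when the total score is at or above the candidate cut-off, and that ties in the index are resolved in favor of the lowest cut-off; neither convention is stated in the text. The scores below are invented for illustration and are not the study data.

```python
def sensitivity_specificity(scores, is_case, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' flags a case."""
    tp = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
    fn = sum(1 for s, c in zip(scores, is_case) if c and s < cutoff)
    tn = sum(1 for s, c in zip(scores, is_case) if not c and s < cutoff)
    fp = sum(1 for s, c in zip(scores, is_case) if not c and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def youden_index(scores, is_case, cutoff):
    """Youden index J = (Sensitivity + Specificity) - 1."""
    sens, spec = sensitivity_specificity(scores, is_case, cutoff)
    return sens + spec - 1

def optimal_cutoff(scores, is_case):
    """Observed score with the highest Youden index (lowest one on ties)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: youden_index(scores, is_case, c))

# Hypothetical total scores: three probable cases, four non-clinical pullers.
scores = [14, 18, 20, 5, 8, 12, 15]
is_case = [True, True, True, False, False, False, False]

best = optimal_cutoff(scores, is_case)
print(best, sensitivity_specificity(scores, is_case, best))
```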

Fig. 4 The results of the ROC curve analysis. Note: AUC = 0.87 (p < 0.001), indicating that the MGH-HPS total score differentiates individuals who meet all of the DSM-5 criteria for hair-pulling disorder (N = 23) from individuals who pull their hair but do not meet the DSM-5 criteria (N = 69)

Table 4 Results of ROC analysis and optimal cut-off point

Discussion

The aim of the study was to examine the reliability and validity of the Polish version of the Massachusetts General Hospital Hair-pulling Scale, originally developed by Keuthen et al. (1995). Also, the factor structure of the scale was assessed using confirmatory factor analyses. The results indicate that the Polish version of the MGH-HPS has acceptable psychometric properties and may be applied to measure hair-pulling symptoms in adult Polish samples.

To investigate the factor structure of the Polish version of the MGH-HPS, we first tested the two-factor model endorsed in Keuthen et al.’s (2007) internet-based survey. This model showed good fit to the data; however, the correlation between factors turned out to be very high (r = 0.89), suggesting that the one-factor model may describe the structure of the scale better than the two-factor model. Further analysis showed that the unidimensional, more parsimonious model fitted the data very well and did not perform worse than the two-factor model. Hence, one total score reflecting the severity of actual hair-pulling should preferably be calculated for the Polish version of the MGH-HPS by summing the seven individual item scores. This finding is in line with the results reported in Keuthen et al.’s (1995) original study on the development of the scale and with more recent studies on the Persian (Rabiei et al., 2014) and the Turkish (Aydin et al., 2023) MGH-HPS translations. Aydin et al. (2023) suggested that the differences in factor structure reported in previous studies on the MGH-HPS may be attributed to the use of different study samples. While the use of clinical samples resulted in a unidimensional structure (Aydin et al., 2023; Keuthen et al., 1995; Rabiei et al., 2014), the online survey involving participants recruited from the general population resulted in a two-factor model. However, it is worth noting that these studies did not formally compare the unidimensional and two-factor models: they either performed EFA to determine the number of factors (Aydin et al., 2023; Keuthen et al., 2007), or they assumed the unidimensional factor structure of the scale and performed CFA but did not test the alternative model (Rabiei et al., 2014). Our study suggests that both models may perform similarly well in samples recruited from the general population, but the strong correlation between factors suggests that the one-factor model may describe the structure of the scale more accurately.

The Polish version of the MGH-HPS showed good internal consistency for both the one-factor model (Cronbach’s alpha coefficient as high as 0.89) and the two-dimensional structure (Cronbach’s alphas for “Severity” and “Resistance and Control” subscales 0.85 and 0.84, respectively). The reliability (internal consistency) of the unidimensional structure is similar to the Cronbach’s alpha values obtained in Keuthen et al.’s (1995, 2007) previous validation studies (i.e., α = 0.88 and 0.84) and to that obtained in the validation study of the Persian translation of the scale (α = 0.91; Rabiei et al., 2014); however, this reliability is lower than the value calculated for the revised Turkish-language version (α = 0.96; Aydin et al., 2023). The comparison of reliability coefficients for the MGH-HPS total score may support Aydin et al.’s (2023) recent observation that replacing the term “hair” with the term “trichome/hair” (as was done for the Turkish version) may improve the reliability of the scale, since subjects who only pulled eyebrows or eyelashes may rate some of the items inaccurately. Also, the internal consistency calculated in our study for the “Severity” and “Resistance and Control” factors (0.85 and 0.84, respectively) was comparable to the reliability identified by Keuthen et al. (2007) for these factors (0.83 and 0.81).

The current study also confirms the ability of the Polish version of the MGH-HPS to distinguish clinical and non-clinical hair-pullers. The findings clearly demonstrated that individuals with subclinical symptoms scored significantly lower on the MGH-HPS than those who endorsed DSM-5 criteria for hair-pulling disorder. A significant difference was found not only for the overall score of the scale, but also for the “Severity” and “Resistance and Control” subscales. Moreover, the ROC analyses indicated that a total MGH-HPS score cut-off value of 13 can distinguish participants who are likely to suffer from hair-pulling disorder from non-clinical hair-pullers with 100% sensitivity and 65% specificity. It should be stressed, however, that the cut-off value obtained in our study refers to the difference between the clinical group and the group of individuals who also confirmed hair-pulling behaviors but did not meet DSM-5 criteria for hair-pulling disorder. Therefore, it is not surprising that the cut-off value is higher than the value of ≥ 9 that was adopted by Aydin et al. (2023) in a recent study using a clinical sample and a non-clinical control group.

Our findings also indicated that women and men do not differ according to the MGH-HPS total score; however, symptoms reported by women seem to be more severe, as indicated by a significantly higher Severity subscale score. This finding is in line with a recent meta-analysis (Thomson et al., 2022) which showed that although women are not more likely than men to engage in any hair-pulling behaviors, they are more inclined to report noticeable hair loss due to pulling, which may reflect greater severity of symptoms.

Due to the absence of other valid Polish-language tools for trichotillomania measurement, the convergent validity was examined by calculating the Spearman’s correlations between the MGH-HPS score and the number of DSM-5 hair-pulling disorder criteria (the question concerning the presence of a dermatological issue or mental illness which may better explain the symptoms was not taken into account). As expected, a strong, positive relationship between these measurements was found, which indicates that individuals who score high on the MGH-HPS are also more likely to confirm meeting the trichotillomania criteria. Divergent validity was assessed by calculating Spearman’s correlations between the MGH-HPS score and the DASS-21 score, which is a self-reported scale measuring current symptoms of depression, anxiety, and stress. In our study we did not observe a significant relationship between the MGH-HPS scores and the DASS-21 dimensions (except a weak correlation between the MGH-HPS “Severity” subscale and the DASS-21 stress subscale). This finding supports the discriminant validity of the MGH-HPS; however, it also indicates that in the study sample hair-pulling was not associated with emotional distress. This observation is inconsistent with previous studies showing a high co-occurrence of TTM, depression and anxiety (e.g., Grant et al., 2017; Houghton et al., 2016; Keuthen et al., 2016; Woods et al., 2006); however, it is in line with a recent meta-analysis showing the relatively limited association between trichotillomania and negative affectivity (Snorrason et al., 2023). It is plausible that the lack of correlation found in the current study may be at least partially a consequence of the use of a small, non-clinical community sample. While high comorbidity with anxiety and mood disorders is detected predominantly in help-seeking individuals with severe psychiatric conditions (Caspi & Moffitt, 2018; Lahey et al., 2021), subclinical hair-pulling is less likely to be accompanied by depression and anxiety, given that it does not lead to noticeable distress or functional impairment. The weak, positive correlation between the severity of hair-pulling and the DASS-21 stress subscale suggests, however, that the more severe the TTM symptoms, the higher the stress experienced by individuals. The relationship between these two variables may be bidirectional, as agitation could be both a cause and effect of hair-pulling (Mansueto et al., 1997).

Validity analyses also showed the lack of a significant relationship between the MGH-HPS and any of the OCI-R subscales. Although this finding confirms the divergent validity of the MGH-HPS, it is inconsistent with the results of previous studies showing the link between trichotillomania and obsessive–compulsive symptoms, as well as with theoretical assumptions that TTM and OCD appear to be strongly comorbid and genetically related (Houghton et al., 2016; Keuthen et al., 2014; Monzani et al., 2014). It is plausible, however, that the relationship between hair-pulling behaviors and various OCD symptoms is less visible in a sample recruited from the general population which consists mainly of non-clinical hair-pullers who may not share the genetic burden of OCD. Although additional correlational analysis conducted on a subsample of individuals who meet the DSM-5 criteria for TTM also did not show a significant relationship between the MGH-HPS and OCI-R scores (see supplementary material for details), this analysis likely did not have enough power to detect such effects due to the small sample size (n = 23).

While the current study provides evidence for the validity of the Polish version of the MGH-HPS, some methodological limitations should be mentioned. Firstly, the study sample involved mainly university students with varying severity of hair-pulling behaviors. Although some of the participants met the DSM-5 criteria for hair-pulling disorder, this subgroup was small and the diagnosis was based on self-reports rather than being confirmed in a clinical interview and medical examination of hair loss. As we were not able to examine the psychometric properties of the Polish version of the MGH-HPS in a clinical group, the results of the ROC analysis need to be treated with some caution. Secondly, the participants were recruited and assessed via the internet. Applying this method of data collection limits control over whether the questionnaires have been completed reliably and in accordance with the instructions. Also, using self-reported data forms to assess the history of dermatological and psychiatric illness may not reflect the actual occurrence of comorbid illness. The sample of individuals engaging in hair-pulling behaviors that was used to evaluate the construct validity of the MGH-HPS was relatively small. The appropriate sample size for CFA is a widely debated issue and many different rules have been formulated over the years (see Kyriazos, 2018 for review). For example, ratios of cases to free parameters of 5:1, 10:1, and 20:1 have previously been suggested as optimal for running CFA (Bentler & Chou, 1987; Bollen, 1989; Jackson, 2003; Kline, 2016; Schumacker & Lomax, 2015). Monte Carlo studies (e.g., Wolf et al., 2013) have shown that, depending on the number of factors and indicators, indicator loadings, and strength of correlation between factors, the sample sizes required to achieve minimal bias, adequate statistical power, and the solution propriety of a given model in CFA may range from as small as 30 to almost 500. Given the non-normal distribution of the data, the results of the CFA (especially those concerning the two-factor solution) should be interpreted cautiously until replicated in a larger sample. Nevertheless, it should be noted that the minimum ratio of 5 participants per estimated parameter (Bentler & Chou, 1987), suggested in the literature as being sufficient for running CFA, was exceeded in this study; moreover, on the basis of a previous simulation study, N = 90 should be sufficient to test a one-factor model with more than 6–8 indicators and factor loadings > 0.50 using CFA (Wolf et al., 2013). Further studies on the Polish version of the MGH-HPS should also examine the test–retest reliability of the tool. Additionally, the scale’s responsiveness to changes in symptoms during the course of disease and its treatment should be investigated in a further longitudinal study. Also, it is worth examining whether the paper-and-pencil and online versions of the scale yield similar results.

The Polish version of the Massachusetts General Hospital Hairpulling Scale (MGH-HPS; Keuthen et al., 1995) is a reliable and easily administered tool which may be used to measure hair-pulling severity, as well as resistance to and control over hair-pulling behaviors in adults with TTM symptoms. Although further studies on the scale’s stability and reliability in larger groups and clinical groups are still needed, the scale may be useful in the diagnosis of TTM and treatment planning, especially as there are no other validated tools for measuring TTM in the Polish language. Hopefully, an additional language version of the scale will also facilitate cross-cultural and epidemiological research on TTM.