Predictive gender and education bias in Kessler's psychological distress Scale (k10)
- Cite this article as:
- Baillie, A.J. Soc Psychiat Epidemiol (2005) 40: 743. doi:10.1007/s00127-005-0935-9
Kessler's Psychological Distress Scale (K10) is a ten-item measure of psychological distress that has been used in recent epidemiological research and as a screen for mental disorders. Moderate relationships have been reported between the K10 and measures of related constructs, such as diagnoses of mental disorders and associated disability. However, it is unclear whether the validity of the K10 is consistent across important demographic, cultural, and socio-economic groups such as gender and educational history or whether there is evidence of predictive bias or inconsistency across these groups.
Differential validity or predictive bias due to gender and completing secondary school was examined in the relationship between K10 scores and disability days, SF12 Mental Component Summary (MCS) scores, and 1-month Composite International Diagnostic Interview (CIDI) diagnoses of Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) anxiety and depressive disorders. Hierarchical linear and logistic regression analyses were applied to the Australian National Survey of Mental Health and Wellbeing data set.
Very small slope and/or intercept biases in the relationship between the K10 and disability days, the SF12 MCS, and 1-month CIDI diagnoses of anxiety and depression were found [effect sizes, the ratio of variance explained to unexplained variance (Cohen's f2), varied from 0.0001 to 0.004].
Gender and educational predictive biases in the relationship between the K10 and disability days, SF12 MCS, and 1-month diagnoses were found to be very small and are unlikely to have any practical impact. This analysis adds to evidence supporting the use of the K10 in epidemiological research.
Key words: K10, screening, mental disorder, gender, education, bias
Do gender and level of education influence the relationship between scores on Kessler's brief measure of psychological distress, the K10, and related constructs such as disability, general mental health status, and mental disorder? The K10 is a well-developed measure of psychological distress that is becoming increasingly popular in psychiatric epidemiology , 9, 14]. Its main strength is a superior ability to screen for anxiety and affective disorders.
Kessler et al.  outlined how methods from item response theory were used to select questions with optimal sensitivity for assessing psychological distress around the range where diagnostic decisions are likely to be made (the 10% of the population with the highest distress). Questions were also selected if they performed consistently across socio-demographic groups; that is, if the relationship between each K10 item and the latent trait underlying all ten items was consistent across important socio-demographic groups. These methods minimise biases due to socio-demographics.
It is important to follow up these development methods with research to evaluate whether the K10 is equally valid across important socio-demographic groups. The K10 derives validity from its relationships with other measures of related constructs. Andrews and Slade  report moderate rank correlations with the General Health Questionnaire (ρ=0.5), the 12-Item Short Form from the Medical Outcomes Study (ρ=−0.6), and the number of consultations for a mental health problem in the past year (ρ=0.3). They and Furukawa et al.  also report on the ability of the K10 to act as a screening test for anxiety and depressive disorders in community samples. As these are all measures of constructs related to psychological distress, the strength of the relationships reinforces the construct validity of the K10. Whether these relationships are consistent across socio-demographic groups is not known. For example, is the K10 an equally good screening test for diagnoses of mental disorders in different socio-demographic groups?
Methods for the examination of predictive bias from psychometrics can answer these questions. Predictive bias refers to the relationship between one measure (the test) that is said to predict another (the criterion). This relationship can be represented by a regression equation y=mx+b, where y is the score on the criterion, x is the score on the test, m is the slope or regression coefficient, and b is the intercept or constant. ‘Under one broadly accepted definition, no bias exists if the regression equations relating the test and the criterion are indistinguishable for the groups in question’ (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, p. 79 ). Cleary et al.  first proposed a method of evaluating predictive bias by examining regression equations for consistency across socio-cultural groups. These equations can differ between groups in slope or unstandardised regression coefficients (termed slope bias), in intercept or constant (termed intercept bias), or in both. The statistical significance of differences between groups can be tested by examining the significance of regression coefficients reflecting the interaction between the test and the group in question. If these interaction terms are statistically significant then the groups show different relationships between the test and the criterion and this is defined to indicate predictive bias.
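The comparison of regression equations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis reported in this paper: the function names `fit_line` and `bias_terms` and any data passed to them are hypothetical, and the group is a single 0/1 dummy code. With one binary group, the differences between the separately fitted lines equal the coefficients for the group term (intercept bias) and for the test-by-group interaction (slope bias) in the combined regression. The sketch does not include the significance tests.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + b; returns (m, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, y_bar - m * x_bar


def bias_terms(xs, ys, groups):
    """Slope and intercept differences (group 1 minus group 0).

    xs: test scores, ys: criterion scores, groups: 0/1 dummy codes.
    """
    fits = {}
    for g in (0, 1):
        gx = [x for x, k in zip(xs, groups) if k == g]
        gy = [y for y, k in zip(ys, groups) if k == g]
        fits[g] = fit_line(gx, gy)
    slope_bias = fits[1][0] - fits[0][0]
    intercept_bias = fits[1][1] - fits[0][1]
    return slope_bias, intercept_bias
```

A slope difference near zero together with an intercept difference near zero corresponds to the "indistinguishable regression equations" case in the definition quoted above.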
The language used to explain predictive bias may imply that it only applies to situations wherein a test is compared with a criterion. Because the relationship between a test and a criterion is one form of validity, this definition of predictive bias and associated methods are also applicable to examining the differential validity of a test between groups.
Gender differences in psychopathology are pervasive so it is important to verify that purported gender differences are an effect of gender and not an artefact of biased measurement. Education and literacy may impact on responses to self-report questions. For these reasons, predictive biases due to gender and education are examined.
The aim of this paper is to examine evidence for predictive gender or education bias in the relationship of the K10 with measures of (1) disability, as assessed by number of disability days in past month; (2) general mental health status, as assessed by the Mental Component Summary score of the 12-item Short Form of the Medical Outcomes Study questionnaire (SF12 MCS); and (3) current mental disorders, as assessed by 1-month mental disorders from the World Health Organization Composite International Diagnostic Interview (WHO CIDI).
These three criterion measures were chosen because of the K10's increasing use as a screen for current mental disorders, and because some of the K10's validity is derived from relationships with general mental health status and disability (following Andrews and Slade ).
The Australian National Survey of Mental Health and Wellbeing (NSMHWB) is described in detail elsewhere . A stratified multistage area sample of 13,624 private dwellings across Australia was selected. Households were sent a description of the survey by post a week before an interviewer called. The interviewer collected demographic information about all residents of the household, and the occupant over the age of 18 with the next birthday was selected for interview. Of those selected, 10,641 people (78.1%) gave verbal consent to be interviewed. Non-response was composed of 1,477 refusals (10.8%), 558 no contacts (4.1%), and 948 (7.0%) for other reasons described as language problems, death or illness in the household, respondent away for the entire survey period, and interview terminated. No further comparative data for non-responders are available .
Of the 10,641 people interviewed in the NSMHWB, 137 (1.2%) scored less than 24 on the Mini-Mental State Examination , indicating likely cognitive impairment, and were excluded. The remaining sample of 10,504 was made up of 5,868 women (55.7%), with 4,607 (43.9%) respondents reporting they had completed the highest level of secondary school available to them; 715 (6.8%) met CIDI and Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) diagnoses of any of the anxiety or depressive disorders in the month before interview.
Trained lay interviewers administered the K10  in a computer-assisted interview with a number of other measures. Three measures are employed in these analyses: the number of days lost in the past month to disability from the Brief Disability Questionnaire , the MCS score of the 12-item Short Form of the Medical Outcomes Study Questionnaire (SF12, Ware et al. ) as a measure of general mental health status, and diagnoses of DSM-IV Anxiety or Depressive disorders from the computerised version 2.1 of the CIDI . The CIDI has good to acceptable reliability and consistency [2, 18]. Whether participants had completed the highest level of secondary school available to them was also recorded in the interview as an index of educational attainment.
All analyses were carried out using SPSS version 11.0.1 (SPSS Inc, 2001) on the December 2000 edition of the National Survey of Mental Health and Wellbeing confidentialized unit record file obtained from the Australian Bureau of Statistics . Predictive bias due to gender and level of education was examined using multiple regression methods first described by Cleary et al. . Initially, K10 scores were entered into regression equations predicting each of disability days, SF12 MCS, and 1-month CIDI diagnoses of DSM-IV anxiety or depression. Then dummy codes for gender (male=1, female=0), whether the participant had completed secondary school (completed=1, not completed=0), their interaction, and their interactions with K10 scores were entered as a block, and the change in the variance of the predicted variable accounted for was examined. This provides an overall test for the presence of slope and/or intercept bias. If this second set of variables accounted for statistically significant additional variance, partial regression analyses were examined for evidence of intercept bias (coefficients for gender and completing secondary school) and/or slope bias (coefficients for the interaction of gender with K10 and the interaction of completing secondary school with K10). Cohen's f2, the ratio of variance explained to unexplained variance [e.g. R2/(1−R2) or sr2/(1−R2)], was calculated as an index of effect size according to the methods described by Cohen et al. .
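The second block of predictors described above can be sketched as follows. This is an illustrative Python sketch rather than the SPSS procedure used in the analysis: the function names `block_two_terms` and `f_change` and the dictionary keys are hypothetical, and `f_change` is the standard F test for a change in R2 between nested regression models, which is assumed here to correspond to the overall test for slope and/or intercept bias.

```python
def block_two_terms(k10, male, completed):
    """Return the five second-block predictors for one respondent.

    k10: the (possibly transformed) K10 score.
    male: 1 for male, 0 for female.
    completed: 1 if secondary school was completed, 0 otherwise.
    """
    return {
        "gender": male,
        "school": completed,
        "gender_x_school": male * completed,
        "gender_x_k10": male * k10,
        "school_x_k10": completed * k10,
    }


def f_change(delta_r2, r2_full, n_added, df_residual):
    """F statistic for the R2 increase when n_added predictors are entered.

    df_residual is the residual degrees of freedom of the full model.
    """
    return (delta_r2 / n_added) / ((1 - r2_full) / df_residual)
```

The gender and school terms carry any intercept bias, and the two K10 interaction terms carry any slope bias, which is why the block is tested first as a whole and then term by term.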
Hanley and McNeil's  methods for the non-parametric comparison of receiver operating characteristic (ROC) curves from independent samples were used to compare the ability of K10 to screen for 1-month CIDI DSM-IV diagnoses of anxiety and depressive disorders across gender and education.
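A comparison of two areas under the curve (AUC) from independent samples can be sketched as below. This is a hedged illustration and not the code used for the analysis: the function names are hypothetical, the standard error formula is the approximation commonly attributed to Hanley and McNeil, and the text does not state which standard error estimator or which tail of the z distribution was used, so the two-sided probability returned here is an assumption.

```python
import math


def auc_standard_error(auc, n_cases, n_noncases):
    """Approximate standard error of an AUC (Hanley and McNeil style)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_noncases - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_noncases)
    return math.sqrt(variance)


def compare_independent_aucs(auc_a, se_a, auc_b, se_b):
    """z test for the difference between AUCs from independent samples.

    Returns (difference, SE of difference, z, two-sided probability).
    """
    diff = auc_a - auc_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return diff, se_diff, z, p_two_sided
```

Because the groups compared (for example, men and women) contain different respondents, the variances of the two AUCs simply add; no correlation term is needed.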
Items on the K10 were scored from 1 for “none of the time” to 5 for “all of the time”, giving total K10 scores ranging from 10 to 50. Scores on the K10 and the number of disability days (DD) were highly positively skewed (K10 skewness=2.22, SE skew=0.024; DD skewness=5.76, SE skew=0.024), so appropriate normalising transformations were sought. The frequencies to z transformation  led to a more normal distribution of K10 scores than log, square root, or inverse transformations. However, significant skewness remained (skewness=0.65, SE skew=0.024). The inverse transformation gave the most normal distribution of disability days, although again significant skewness remained (skewness=−1.18, SE skew=0.024). Because significant skewness remained, Spearman's rank correlations are also presented to check that any effects observed are not simply an artefact of transformations.
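The scoring rule and the standard error of skewness quoted above can be illustrated with a short Python sketch. The function names are hypothetical, and the standard error formula is the usual large-sample expression reported by statistical packages, which is assumed (not confirmed by the text) to be the one used here. For a sample of about 10,504 respondents it gives a value close to 0.024.

```python
import math


def k10_total(item_responses):
    """Sum ten K10 items, each scored 1 (none of the time) to 5 (all of the time)."""
    if len(item_responses) != 10:
        raise ValueError("the K10 has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in item_responses):
        raise ValueError("each item must be scored 1 to 5")
    return sum(item_responses)


def skewness_standard_error(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
```

Dividing a skewness value by this standard error gives the z-type ratio used to judge whether the remaining skewness is statistically significant.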
One month CIDI DSM-IV diagnoses of major depressive disorder, single episode mild, moderate, and severe (296.21, 296.22, and 296.23); recurrent episode mild, moderate, and severe (296.31, 296.32, and 296.33); dysthymia (300.4); panic disorder and or agoraphobia (300.01, 300.21, and 300.22); generalised anxiety disorder (300.02); social phobia (300.23); obsessive compulsive disorder (300.3); and post-traumatic stress disorder (309.81) were combined to give 1-month CIDI diagnoses of DSM-IV Anxiety and/or Depressive disorders.
Normalised population weights were used in multiple regression and logistic regression analyses (but not ROC analyses) because the data analysed were collected from a randomised stratified sample of the Australian population.
Bias in prediction of disability days
The effect of gender and completing secondary school on the relationship between the K10 and the number of disability days in the past month was examined by hierarchical multiple regression analysis. In the first step, transformed K10 scores were entered as a predictor of transformed disability days (adjusted R2=0.105). Gender and completing secondary school and their interactions with transformed K10 scores explained a small but statistically significant amount of additional variance in transformed disability days, indicating evidence for slope and/or intercept bias [adjusted R2=0.108, ΔR2=0.003, F(5,10490)=7.849, p<0.001]. The estimated effect size for the additional variance in disability days explained by this set of variables was 0.003 (Cohen's f2). With the sample employed, the power to detect this effect was 0.997.
The partial correlations showed a small but statistically significant interaction between completing secondary school and transformed K10 scores, which indicated a slope bias in the relationship between K10 and disability days [β=−0.037, t(10490)=−3.04, p<0.005, f2=0.0001]. Those who completed secondary school showed a weaker relationship (flatter slope) between K10 and disability days than those who did not complete secondary school. Neither gender [β=−0.025, t(10490)=−1.94, p=0.052] nor completing secondary school [β=−0.013, t(10490)=−0.938, p=0.348] was statistically significant, indicating no sign of intercept bias. The interaction between gender and transformed K10 scores was also non-significant, indicating that there was no slope bias due to gender [β=0.001, t(10490)=0.111, p=0.912]. There was also no significant interaction between gender and completing secondary school [β=−0.022, t(10490)=−1.379, p=0.168].
Spearman's rho rank correlations for the relationship between K10 and disability days (rows: completed secondary school; did not complete secondary school)
Bias in prediction of SF12 MCS
To examine whether there is a different relationship between scores on the K10 and the SF12 MCS for men and women and for those who had completed secondary school compared with those who had not, dummy codes for these variables and their interaction with K10 scores were added to the regression of transformed K10 scores on SF12 MCS scores (adjusted R2=0.468). As above, a small but statistically significant increase in the variance in MCS scores was accounted for by gender and completing the highest available level of secondary school and their interactions with transformed K10 scores, indicating slope and/or intercept bias [adjusted R2=0.470, ΔR2=0.002, F(5,10496)=7.523, p<0.001]. Again this represents a very small effect size (Cohen's f2=0.004), but because of the large sample the analysis is overpowered (power=0.999).
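The effect sizes reported for the two hierarchical regressions above can be checked approximately from the published figures. The sketch below is illustrative only: the function name `cohens_f2` is hypothetical, and because it uses the rounded adjusted R2 values as printed, the results are approximations rather than the exact values computed from the data.

```python
def cohens_f2(delta_r2, r2_full):
    """Cohen's f2 for an added block: variance added over variance left unexplained."""
    return delta_r2 / (1 - r2_full)


# Disability days: delta R2 = 0.003, full-model adjusted R2 = 0.108
f2_disability_days = cohens_f2(0.003, 0.108)

# SF12 MCS: delta R2 = 0.002, full-model adjusted R2 = 0.470
f2_sf12_mcs = cohens_f2(0.002, 0.470)
```

Rounded to three decimal places, these give 0.003 and 0.004, in line with the reported effect sizes.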
The partial correlation analysis showed a very small but statistically significant intercept bias for gender of 0.4 of a point on the MCS greater for men [B=0.446, t(10496)=2.459, p=0.014, f2=0.0003] and 0.5 MCS point less for completing secondary school [B=−0.493, t(10496)=−2.600, p=0.009, f2=0.0003]. Very small but significant slope bias was evident in the interaction between gender and transformed K10 scores [B=0.374, t(10496)=2.303, p=0.021, f2=0.0002] and the interaction between completing secondary school and transformed K10 scores [B=0.360, t(10496)=2.180, p=0.029, f2=0.0002]. Because K10 scores were transformed in this analysis, the B coefficients for interaction terms are not directly interpretable. There was no significant interaction between gender and completing secondary school [B=−0.228, t(10496)=−0.869, p=0.385].
Spearman's rho rank correlation between K10 and SF12 MCS scores (rows: completed secondary school; did not complete secondary school)
Bias in prediction of 1-month CIDI mental disorder
Logistic regression analyses were employed to examine predictive bias or differential validity due to gender or completing secondary school in the relationship between K10 scores and 1-month DSM-IV diagnoses of anxiety or depressive disorders according to the CIDI. K10 scores were recoded into four categories: low (10–15), moderate (16–21), high (22–29), and very high (30–50) following the Australian Bureau of Statistics . In this way bias around commonly used cut-off points for screening could be examined.
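The recoding of K10 totals into the four bands above can be written as a small function. This is a minimal sketch; the function name `k10_band` and the string labels are hypothetical, while the cut-off points are those given in the text.

```python
def k10_band(total):
    """Map a K10 total score (10 to 50) to one of four distress bands."""
    if not 10 <= total <= 50:
        raise ValueError("K10 totals range from 10 to 50")
    if total <= 15:
        return "low"
    if total <= 21:
        return "moderate"
    if total <= 29:
        return "high"
    return "very high"
```

Entering the band as a categorical predictor, rather than the continuous score, lets the interaction terms test for bias at each cut-off separately.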
Gender, completing secondary school, and their interactions with K10 scores in the above four bands explained significant additional variance in 1-month diagnoses of anxiety and depression over and above that explained by K10 scores alone (−2log likelihood=4006.14, Δ−2log likelihood=χ2(9)=24.747, p=0.00326). However, as before, any differential validity or predictive bias was very small at less than 1% of additional variance explained (ΔNagelkerke R2=0.006). It is also unclear whether this additional variance reflects a slope or intercept bias as none of the relevant Wald tests for gender, secondary school, or their interactions with K10 scores were statistically significant.
Differences due to gender and completing the highest level of secondary school between ROC curves of the K10 as a screen for 1-month CIDI DSM-IV diagnoses of anxiety and depression. Entries are the difference in AUC (SE of difference) and the probability of the z test for each pair of groups.
Men who did not finish secondary school vs. men who finished secondary school: −0.012 (0.015), p=0.797
Men who did not finish secondary school vs. women who did not finish secondary school: 0.010 (0.014), p=0.236
Men who did not finish secondary school vs. women who finished secondary school: 0.013 (0.017), p=0.210
Men who finished secondary school vs. women who did not finish secondary school: 0.022 (0.014), p=0.053
Men who finished secondary school vs. women who finished secondary school: 0.026 (0.017), p=0.061
Women who did not finish secondary school vs. women who finished secondary school: 0.004 (0.015), p=0.408
There is statistically significant evidence of slope and/or intercept bias due to gender and education in the relationships between the K10 and measures of disability but not current CIDI diagnoses. Women who did not complete the highest level of secondary schooling available to them showed the strongest relationships, whereas relatively weaker relationships were shown for women who completed secondary school (in the case of disability days) and men who completed secondary school (for SF12 mental component score). These results may indicate predictive bias for the K10, despite the use of methods from item response theory that minimised any item biases during development. However, these effects were very small (effect sizes ranging from 0.0001 to 0.004), are only likely to be seen in very large samples, and thus are likely to be of little concern for most purposes.
There was also little evidence of predictive bias in the relationship between K10 and CIDI DSM-IV diagnoses of anxiety and depressive disorders. That these biases have been found to be small and non-significant in a large community sample gives support for the predictive validity of the K10. The relationship between K10 and diagnoses of anxiety and depression is perhaps the most important justification for the use of the K10 as an indicator of the mental health of populations.
The lack of meaningful gender and educational predictive biases does not provide any information about the effect of other socio-cultural variables. It is therefore important to evaluate other socio-cultural variables because of the use of the K10 to monitor the mental health of multicultural communities. Within the current data set, there were too few respondents from too broad a range of linguistic and cultural groups to evaluate the effect of ethnicity on the validity of the K10. Evaluating whether the K10 remains a good measure of psychological distress and predictor of other measures of mental health in different languages, and hence cultural and ethnic groups, remains a research priority.
It is important to note that this paper relies upon self-report measures of all the concepts under examination. Although this is common practice in psychiatric epidemiology, it is likely that self-report methods do not capture the breadth of each concept under study.
The analysis reported in this paper found inconsistent evidence of very small gender and educational predictive biases in the relationship between K10 scores and other measures of mental health. That these effects are very small, even where statistically significant, adds support to the growing evidence about the predictive validity of the K10. In keeping with the use of item response theory in the development of the K10, the performance of these ten simple questions is quite remarkable. However, a thoroughgoing analysis of the strengths and weaknesses of the scale needs to continue.
The design, development, and execution of the Australian National Survey of Mental Health and Wellbeing were funded by the Australian Commonwealth Department of Health and Family Services. The development of the survey instrument was carried out by Prof. Gavin Andrews, Dr. Lorna Peters, Dr. Tim Slade and others at the WHO Collaborating Centre in Mental Health at St. Vincent's Hospital, Sydney. The design, development, and conduct of the survey were overseen by Profs. Scott Henderson, Gavin Andrews, Wayne Hall, Helen Herrman, Assen Jablensky, and Bob Kosky. Fieldwork and compilation of the data were conducted by the Australian Bureau of Statistics.