The research population consisted of employees on sick leave from three Dutch companies: an academic hospital, a university and a steel company. A diversity of job functions was included in this study. The inclusion criteria were: full-time or part-time sick leave, duration of sick leave shorter than 8 weeks, and no period of sick leave for the same reason within 1 month before the current episode, so as to select only incident cases of work disability.
Recruitment Study Population
All employees who had been on sick leave for more than 1 week received the Distress Screener together with an explanatory letter from the OP. Respondents who met the inclusion criteria and were willing to participate were divided into two groups according to their Distress Screener score, using a cut-off point of four. This cut-off point was based on a required level of specificity, calculated in a group of non-selected primary care patients (N = 2127), and was used for the intervention study: the ADAPT study. The research population in the current paper is based in part on the population of the ADAPT study. The ADAPT study is a randomized controlled trial evaluating the cost-effectiveness of a participatory workplace intervention compared with usual care for sick-listed employees with distress. The workplace intervention is a stepwise approach in which an employee and supervisor, guided by a mediator, identify and prioritize obstacles and solutions for a return to work. The intervention aims to reach consensus between a sick-listed employee and his or her supervisor about a plan for return to work. Between April 2006 and May 2007, respondents who screened positive on the Distress Screener were recruited for the ADAPT study. In addition, a sample of non-distressed (screened negative) employees on sick leave was recruited from the same three companies between January and May 2007. Permission was obtained from the Medical Ethics Committee and all respondents provided informed consent.
Within 1 to 2 weeks after filling in the Distress Screener, the respondents filled in the 4DSQ. In addition, data on OPs' diagnoses were obtained from the medical file of each employee. These diagnoses represented a proportional breakdown of the types of health problems that were typical for this population and that led to extended sickness absence. OPs in the Netherlands classify diagnoses according to the international classification of diseases (CAS), which is based on the ICD-10.
The Distress Screener is a short questionnaire comprising three items of the 4DSQ distress subscale: “During the past week, did you suffer from worry?”, “During the past week, did you suffer from listlessness?” and “During the past week, did you feel tense?”. The selection of items was made a priori from a dataset consisting of 2,127 primary care patients. The three items were chosen based on their factor loadings and ‘difficulties’, so as to maximize the discrimination between subjects with 4DSQ distress-scores ≤10 and subjects with 4DSQ distress-scores >10. The response scale contains three options: “no” (0), “sometimes” (1), and “regularly or more often” (2). A total score (range 0–6) was constructed by summing the answers on the three items. The cut-off point that discriminates between ‘screened positive’ and ‘screened negative’ was set at a score of 4 or higher. A positive score means that the person involved is classified as distressed according to the Distress Screener.
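The scoring rule described above can be illustrated with a minimal sketch. The function names `screener_total` and `screened_positive` are hypothetical and chosen only for illustration; the item coding (0, 1, 2) and the cut-off of 4 follow the text.

```python
from typing import Sequence

CUT_OFF = 4  # 'screened positive' at a total score of 4 or higher (from the text)


def screener_total(items: Sequence[int]) -> int:
    """Sum the three item scores (each coded 0, 1 or 2), giving a total of 0-6."""
    if len(items) != 3 or any(score not in (0, 1, 2) for score in items):
        raise ValueError("expected three item scores, each coded 0, 1 or 2")
    return sum(items)


def screened_positive(items: Sequence[int], cut_off: int = CUT_OFF) -> bool:
    """Return True when the total score reaches the cut-off."""
    return screener_total(items) >= cut_off


# Hypothetical respondent: worry "regularly or more often", the other two "sometimes".
print(screener_total([2, 1, 1]), screened_positive([2, 1, 1]))
```

A respondent answering “sometimes” to all three items would obtain a total of 3 and be classified as screened negative under this rule.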
The 4DSQ is a questionnaire comprising 50 items distributed over four scales: the distress scale contains 16 items (range 0–32), the depression scale contains 6 items (range 0–12), the anxiety scale contains 12 items (range 0–24) and the somatisation scale contains 16 items (range 0–32). The reference period is “the past week”. The response scale contains five categories: “no” (0), “sometimes” (1), “regularly” (2), “often” (2), “very often or constantly” (2). The item scores are summed to obtain scale scores. Discrimination between ‘cases’ and ‘non-cases’ was established at a distress score >10, somatisation score >10, depression score >2, and anxiety score >7 [17, 18, 24]. The 4DSQ has been shown to be a valid and reliable measure of distress. According to Van Rhenen et al., a distress score of >10 is appropriate for use in studies of distress in working populations and was therefore used as the reference standard in this study.
OPs' diagnoses [CAS-codes based on the ICD-10 and developed by the Dutch Association of Occupational Medicine (NVAB) and the Employed Persons’ Insurance Administration Agency (UWV)] were classified into six categories of CMDs. Four categories of diagnoses were related to the subscales of the 4DSQ, ‘Distress’ (‘Stress-related complaints’), ‘Somatisation’, ‘Depression’ and ‘Anxiety’, plus the categories of ‘Other psychological complaints’ and ‘Other complaints’ (see Table 1).
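As an illustration of the 4DSQ scoring described above, the following sketch maps the five response categories to item scores, sums them to a scale score and applies the case thresholds. The names `RESPONSE_SCORES`, `CASE_THRESHOLDS`, `scale_score` and `is_case` are hypothetical; only the score values and thresholds come from the text.

```python
from typing import Sequence

# Item coding from the text: the three highest categories all score 2.
RESPONSE_SCORES = {
    "no": 0,
    "sometimes": 1,
    "regularly": 2,
    "often": 2,
    "very often or constantly": 2,
}

# A respondent is a 'case' when the scale score exceeds the threshold.
CASE_THRESHOLDS = {"distress": 10, "somatisation": 10, "depression": 2, "anxiety": 7}


def scale_score(responses: Sequence[str]) -> int:
    """Sum the item scores of one scale."""
    return sum(RESPONSE_SCORES[response] for response in responses)


def is_case(scale: str, score: int) -> bool:
    """Apply the strict '>' threshold for the given scale."""
    return score > CASE_THRESHOLDS[scale]


# Hypothetical distress scale: 16 items, all answered "often", gives the maximum of 32.
print(scale_score(["often"] * 16), is_case("distress", 11))
```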
In order to adjust for the stratified sampling procedure, we used weight factors for all analyses (0.666 for screen-positive and 1.308 for screen-negative employees) to ensure that the population reflected the composition of the source population of incident sick-leave cases.
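One way such weight factors could enter an analysis is sketched below for a simple weighted proportion. This is an assumption about how the weights might be applied, not a description of the software actually used; the function name `weighted_proportion` is hypothetical, and only the two weight values come from the text.

```python
from typing import Sequence

# Weight factors from the text, keyed by screening result (True = screen-positive).
WEIGHTS = {True: 0.666, False: 1.308}


def weighted_proportion(screen_positive: Sequence[bool], outcome: Sequence[bool]) -> float:
    """Weighted proportion of respondents with the outcome (e.g. 4DSQ distress > 10)."""
    numerator = 0.0
    denominator = 0.0
    for positive, has_outcome in zip(screen_positive, outcome):
        weight = WEIGHTS[positive]
        numerator += weight * has_outcome  # a bool counts as 0 or 1
        denominator += weight
    return numerator / denominator


# Hypothetical data: two screen-positive cases and one screen-negative non-case.
print(weighted_proportion([True, True, False], [True, True, False]))
```

In this hypothetical example the unweighted proportion would be 2/3, whereas the weighted proportion is lower because each screen-negative respondent counts for more than each screen-positive respondent.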
Using the 4DSQ distress score >10 as reference standard, the receiver operating characteristic (ROC) curve of the Distress Screener was obtained for a range of seven cut-off values (0–6). Sensitivity and specificity were calculated for each cut-off value. Because this study concerns identification of distress by the Distress Screener at an early stage of sick leave, it was important to identify a high number of true-positive distressed persons while restricting the number of false-positive healthy persons. Therefore, determination of the optimal cut-off point was based on a high sensitivity value with the most appropriate specificity value. To examine the validity of the Distress Screener, Pearson correlation coefficients were calculated between the total score of the Distress Screener and the total score of each 4DSQ subscale. We compared correlated correlation coefficients according to the method described by Meng et al., including a Bonferroni correction. The degree of similarity between two repeated measurements, the test–retest (12-day) reliability, was obtained by computing the Pearson correlation coefficient of the total score of the three items of the Distress Screener with the total score of the same three items of the 4DSQ distress subscale.
Furthermore, the validity was examined by comparing the outcomes of the Distress Screener (screened negative and screened positive) with OPs' diagnoses (categorized CAS-codes). The relation of the Distress Screener with OPs' diagnoses was compared with the relation of the 4DSQ distress subscale with OPs' diagnoses. Sensitivity and specificity values and positive and negative predictive values were determined from the established outcomes. Outcomes of both the Distress Screener and the 4DSQ distress subscale were dichotomised.
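The calculation of sensitivity and specificity over the seven cut-off values can be sketched as follows. This is a minimal illustration under stated assumptions: a respondent is treated as screen-positive when the screener total is at or above the cut-off, the optional weights are applied as simple case weights, and the function name `roc_points` and the example data are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def roc_points(
    screener: Sequence[int],
    distress: Sequence[int],
    weights: Optional[Sequence[float]] = None,
    reference_cut: int = 10,
) -> List[Tuple[int, float, float]]:
    """Return (cut-off, sensitivity, specificity) for screener cut-offs 0 to 6."""
    if weights is None:
        weights = [1.0] * len(screener)
    points = []
    for cut in range(7):
        tp = fn = tn = fp = 0.0
        for score, reference, weight in zip(screener, distress, weights):
            is_case = reference > reference_cut  # reference standard: 4DSQ distress > 10
            positive = score >= cut              # assumed rule: positive at or above cut-off
            if is_case and positive:
                tp += weight
            elif is_case:
                fn += weight
            elif positive:
                fp += weight
            else:
                tn += weight
        sensitivity = tp / (tp + fn) if tp + fn else float("nan")
        specificity = tn / (tn + fp) if tn + fp else float("nan")
        points.append((cut, sensitivity, specificity))
    return points


# Hypothetical scores for four respondents.
for point in roc_points([0, 2, 4, 6], [3, 8, 15, 30]):
    print(point)
```

Plotting sensitivity against 1 minus specificity for these seven points gives the ROC curve; the choice of an optimal cut-off then depends on the trade-off described above.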
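For dichotomised outcomes, the four measures named above follow directly from a 2×2 table. The sketch below uses the standard definitions; the function name `diagnostic_values` and the example counts are hypothetical and are not study results.

```python
from typing import Dict


def diagnostic_values(tp: float, fp: float, fn: float, tn: float) -> Dict[str, float]:
    """Sensitivity, specificity and predictive values from 2x2 cell counts.

    tp/fp: screen-positive with/without the reference condition
    fn/tn: screen-negative with/without the reference condition
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
    }


# Hypothetical counts, for illustration only.
print(diagnostic_values(tp=30, fp=10, fn=20, tn=40))
```

Weighted cell totals can be passed in place of raw counts when a stratified sample has to be adjusted.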