Background

Work-related psychosocial factors or "stressors at work" have been linked to increased risk of ill health and mortality in some [1-8] but not all studies [9-12]. The reasons behind these inconsistencies may include differences in the socio-demographic characteristics of the study populations, variations in the stability of the work stressors during the follow-up, selection bias, and imprecise measurement, particularly of the exposure [13, 14]. Surprisingly little attention has been paid to the fact that the operationalisation of work-related stressors has varied between cohort studies.

A frequently used measure of work stress is the two-dimensional job strain model, originally described by Karasek [15] and further developed, both empirically and psychometrically, by Karasek and Theorell [16]. This model postulates that jobs characterised by a combination of high psychological demands and low control (or low decision latitude), that is, job strain, are likely to elicit work-related psychosocial stress. High psychological demands in the workplace mean that the employee has to work intensively or rapidly, and may experience conflicting expectations. Job control, in turn, refers to the degree of decision-making authority (for example, having an influence on what task to do and how to carry it out) and skill discretion (e.g., the use of personal skills on the job).

The two standardised, widely used questionnaires developed to measure the demand and control dimensions, and hence job strain, are the Job Content Questionnaire (JCQ) [17] and the Demand Control Questionnaire (DCQ) [18, 19]. Both questionnaires include five demand items, whereas the number of control items differs (nine in the JCQ and six in the DCQ); the wording of the items and the response scales also differ. Most of the JCQ items are expressed as statements, and respondents are asked to report whether they agree or disagree with each statement on a four-level Likert scale, while DCQ items are expressed as questions with frequency-based response options (e.g., "Do you have to work very fast?" - Often, sometimes, seldom, never).

Investigators examining the relationship between job strain and health outcomes have often used partial versions of the questionnaires. Study-specific questionnaires have also been developed which can differ from the JCQ or DCQ in terms of the number of items, content and wording of the questions, and response alternatives. It is important to understand whether the different survey instruments assess the same underlying concepts, as this has implications for the interpretation of findings across studies, harmonisation of multi-cohort data for pooled analyses, and design of future studies [20].

Accordingly, the aim of the present analyses was to evaluate the comparability of alternative job demands, job control and job strain measures by assessing agreement against the complete scale. To do so, we used data from six studies together with information from an additional 11 European cohort studies; these 17 studies comprise the "Individual-participant-data meta-analysis in working populations" (IPD-Work) Consortium.

Methods

Study population

This study is part of the "Individual-participant-data meta-analysis in working populations" (IPD-Work) consortium of European cohort studies. This collaboration was established at a workshop, the annual Four Centers Meeting in London, in November, 2008. New cohort studies have subsequently been added. The overall aim of the IPD-Work consortium is to aggregate raw data from a series of studies in order to obtain reliable estimates of the influence of psychosocial risk factors at work on chronic diseases, mental health, disability, and mortality.

In IPD-Work, a pre-defined two-stage data acquisition protocol is being used. The first stage involves the acquisition of baseline data on work stress as well as socio-demographic and lifestyle factors and the definition and harmonisation of these baseline characteristics across the studies. The second stage involves the acquisition of data on disease outcomes ascertained subsequent to the baseline survey. The present analyses were based on stage one only and were thus conducted before any linkage to disease data, planned for the second stage.

We examined the agreement between complete and partial psychological demands and control scales separately in the six cohort studies of the IPD-Work consortium that had the complete job demands and job control scales. These were: the Job Stress Study I (Belstress, Belgium) [21]; the Gaz et Electricité Cohort Study (GAZEL, France) [22]; the Health and Social Support in Finland Study (HeSSup, Finland) [23]; the Swedish Longitudinal Occupational Survey of Health (SLOSH, Sweden) [24, 25]; the Work, Lipids and Fibrinogen Study Norrland (WOLF N, Sweden) [26]; and the Work, Lipids and Fibrinogen Study Stockholm (WOLF S, Sweden) [26]. This resulted in an analytic sample of 70 751 participants. The IPD-Work cohort studies where only partial scales were available were: the Copenhagen Psychosocial Questionnaire Study, Denmark [27]; Danish Work Environment Cohort Study, Denmark [28]; Still Working study, Finland [29]; Finnish Public Sector study, Finland [30]; Heinz Nixdorf Recall Study, Germany [31]; Intervention Project on Absence and Well-being Study, Denmark [32]; KORA-Cooperative Health Research in the Region Augsburg/MONICA, Germany [33]; Netherlands Working Condition Survey, the Netherlands [34]; Permanent Onderzoek Leefsituatie/Continuous Survey on Living Conditions, the Netherlands [35]; PUMA-Copenhagen Burnout Inventory Study, Denmark [36]; and Whitehall II study, the United Kingdom [37]. We used the information on available job demands and control items in these 11 cohort studies to identify partial scales relevant to compare with the complete scales. The details of the design and participants in all 17 IPD-Work studies have been published previously [21-37] and are briefly described in Additional file 1.

The studies from which data were used in the present analysis were approved by the ethics committees of the University Hospital of Ghent and the Faculty of Medicine of the Université Libre de Bruxelles (Belstress); the National Commission Overseeing Ethical Data Collection in France (Commission Nationale Informatique et Liberté) (GAZEL); the Turku University Central Hospital Ethics Committee (HeSSup); the Regional Research Ethics Board in Stockholm (SLOSH); and the ethics committee at Karolinska Institutet, Stockholm (WOLF N and WOLF S). All studies adhered to the Helsinki Declaration on ethical principles for medical research involving human subjects. Details of ethical approval of all the IPD-Work studies are described in Additional file 1.

Definition of the complete and partial scales of job demands and control

Complete scale measures of job demands and job control were based on five items from the psychological demands scales and six items from the control scales from the JCQ and DCQ (referred to hereafter as the "complete scales" - see table 1). These complete scales served as our reference. We omitted the three additional control items in the JCQ that did not have a corresponding item in the DCQ as a means of improving the harmonisation of the control scale across studies. We constructed partial scales based on the JCQ/DCQ items that were available in each of the IPD-Work studies that did not have the complete scales. This resulted in a total of six partial demand scales, five partial control scales, and 10 partial job strain scales. The individual questionnaire items available in each study are presented in table 2.

Table 1 Items from the Job Content Questionnaire (JCQ) and Demand Control Questionnaire (DCQ) included in the IPD-Work
Table 2 Job demand and control items in the cohorts included in the IPD-Work meta-analysis (questionnaire used in study)*

All studies included in the IPD-Work Consortium were designed and initiated before the IPD-Work Consortium began; the choice of instrument to measure job strain therefore varies between studies. In some studies, the wording of the job strain items differed from that in the original JCQ or DCQ, but was judged by the study coordinating authors (EF, SN, KH, TT and MiK) to resemble the original questions sufficiently for the items to be used as proxy items. For example, the question on conflicting demands in the Still Working study was expressed as "Do your superiors or workmates give you contradictory orders or instructions?" as compared with the corresponding item in the DCQ "Does your work often involve conflicting demands?"; and in POLS the item "Do you have to work under great time pressure?" was judged to capture the same content as "Do you have enough time to do everything?" in the DCQ. The scales with proxy items are labelled as "other" in table 2. Some scales were very similar to the JCQ or DCQ and only differed from them in minor aspects; they are labelled as "mainly JCQ" or "mainly DCQ" in table 2.

Using our analytical sample from the six studies with the complete scales, mean response scores for job demands items and for job control items were calculated for each study participant, for both the complete and the different partial scales. For each scale, the mean response score was calculated for participants who had answered half or more of the demand or control questions in that specific scale. However, where only two items were used in a partial scale, both items had to have non-missing values for the mean score to be calculated.
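The scoring rule described above can be illustrated with a minimal Python sketch. It is not the code used in the study (the analyses were run in SAS); the function name `scale_mean` and the use of `None` to mark a missing response are assumptions made purely for illustration.

```python
from typing import Optional, Sequence


def scale_mean(responses: Sequence[Optional[float]]) -> Optional[float]:
    """Mean item score for one participant on one scale.

    `responses` holds one value per item in the scale; None marks a
    missing answer (an illustrative convention, not from the paper).
    Returns None when too few items were answered.
    """
    n_items = len(responses)
    answered = [r for r in responses if r is not None]
    if n_items == 0:
        return None
    if n_items == 2:
        # Two-item partial scales: both items must be answered.
        enough = len(answered) == 2
    else:
        # Otherwise: at least half of the items must be answered.
        enough = 2 * len(answered) >= n_items
    if not enough:
        return None
    return sum(answered) / len(answered)
```

For a five-item demands scale, a participant with three answered items receives a score (the mean of those three), whereas a participant with only two answered items does not.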

The presence of job strain was defined as having high demands (i.e., higher than the study-specific median of the demands scores) and a low control score (i.e., lower than the study-specific median of the control scores). This definition of job strain based on the quadrant approach has been widely used and will be the main method to define job strain in the IPD-Work consortium, including the present analyses. However, other approaches to derive measures of job strain from the demands and control scores have also been proposed, including the quotient method (job demands/job control); the logarithmic approach (log[job demands/job control]); and the subtraction approach (job demands minus job control) [38]. As a subsidiary analysis, we evaluated the agreement between the complete and partial job strain scales when applying these alternative job strain definitions, shown in Additional file 2.
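As a rough illustration of these definitions, the sketch below classifies job strain with the quadrant approach and computes the three alternative continuous formulations. It assumes that the demands and control scores passed in all come from a single study (so that the medians are study-specific) and that no scores are missing; the function names are hypothetical and are not taken from the paper or from reference [38].

```python
import math
from statistics import median
from typing import List, Sequence


def job_strain_quadrant(demands: Sequence[float],
                        control: Sequence[float]) -> List[bool]:
    """Quadrant approach: strain = demands above the study median AND
    control below the study median. Inputs are assumed to be the scores
    of all participants in ONE study, in the same order."""
    d_med = median(demands)
    c_med = median(control)
    return [d > d_med and c < c_med for d, c in zip(demands, control)]


def strain_quotient(d: float, c: float) -> float:
    """Quotient method: job demands / job control."""
    return d / c


def strain_log(d: float, c: float) -> float:
    """Logarithmic approach: log(job demands / job control)."""
    return math.log(d / c)


def strain_subtraction(d: float, c: float) -> float:
    """Subtraction approach: job demands minus job control."""
    return d - c
```

Note that participants scoring exactly at the median on either dimension are not classified as having job strain under the strict inequalities stated in the definition.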

Statistical analysis

The relationship between the complete and partial scales for the demands and control scales was ascertained using Pearson correlation coefficients with accompanying 95% confidence intervals. These were computed using Fisher's transformation. Sensitivity, specificity and Kappa (κ) statistics were calculated to evaluate the agreement between the job strain definitions based on the complete versus partial scales. The following interpretations of the Kappa statistic were utilised [39]: 0.00-0.20 indicates slight agreement, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 good/substantial strength of agreement, and 0.81-1.00 a very good/almost perfect agreement. All analyses were performed using SAS version 9.1 (SAS Institute Inc., Cary, NC, USA).
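The statistics named above can be sketched in plain Python as follows. This is an illustrative re-implementation under simple assumptions (complete data, both strain categories present in the reference classification), not the SAS procedures used in the study; all function names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r: float, n: int, level: float = 0.95) -> Tuple[float, float]:
    """Confidence interval for r via Fisher's z-transformation.
    Requires |r| < 1 and n > 3."""
    z = math.atanh(r)                      # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)            # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def agreement(reference: Sequence[bool],
              test: Sequence[bool]) -> Tuple[float, float, float]:
    """Sensitivity, specificity and Cohen's kappa of `test` (partial-scale
    job strain) against `reference` (complete-scale job strain)."""
    tp = sum(r and t for r, t in zip(reference, test))
    fn = sum(r and not t for r, t in zip(reference, test))
    fp = sum(t and not r for r, t in zip(reference, test))
    tn = sum(not r and not t for r, t in zip(reference, test))
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    p_obs = (tp + tn) / n                  # observed agreement
    # Agreement expected by chance, from the marginal totals.
    p_exp = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return sensitivity, specificity, kappa
```

In this sketch, `reference` would hold the job strain classification from the complete scales and `test` the classification from a partial scale, computed within the same study.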

Results

Pearson's correlation coefficients between the complete demand scale and partial scales are shown in table 3. In all six studies included in the present analysis, the correlation coefficient was r > 0.94 for partial scales comprising four items, and r > 0.90 when the partial scale was based on three items. For partial scales with only two items, the correlation coefficient was somewhat lower (r = 0.76 to 0.88), depending on the cohort and item content.

Table 3 Correlation between the complete psychological job demands scale vs. shorter versions of the scale

Table 4 shows that a largely similar pattern of correlations was observed for the control scale. The correlation coefficients between the complete scale with six items and the partial scales comprising five items were very high (r ≥ 0.96), whereas the correlations between the complete scale and partial scales with two items were slightly lower in magnitude (r = 0.81 to 0.87).

Table 4 Correlation between the complete job control scale vs. shorter versions of the scale

Table 5 presents the sensitivity, specificity and Kappa statistics comparing the job strain definition based on complete and partial job demands and control scales. There was a consistent pattern across the studies, with the agreement between job strain definitions based on complete and partial scales being very good (κ > 0.80), and sensitivity > 0.80 in 14 of 18 analyses, when only one item of either the job demands scale or the job control scale was missing. When the job strain definition was based on three demand items and all six control items, the agreement varied between good and very good (κ > 0.68). This was also seen, with one exception, for job strain definitions based on only two demand items but all six control items. When one or more items were missing in both the demands and control scales, most Kappa statistics (n = 18/24) indicated at least good agreement (κ > 0.60), although for some comparisons (n = 6/24) the agreement was moderate (κ = 0.54 to 0.60). As expected, the sensitivity decreased when several items were missing in one or both scales. The subsidiary analyses using alternative methods to define job strain yielded a pattern of results similar to that of the main analysis, which defined job strain by the quadrant approach (Additional file 2, tables 4 and 5).

Table 5 The agreement between job strain definitions using the complete vs. partial scales

Discussion

The aim of the present analyses was to evaluate the comparability of alternative job demands, job control and job strain measures by assessing agreement against the complete scales. To do so, we analysed data from a total of 70 751 participants in six European cohort studies with complete data on five job demand and six control items. We found very high correlation coefficients between the complete scales and those partial job demands and control scales that included a minimum of three items. The agreement for the dichotomised job strain measure was 'good' to 'very good' when at least one of the underlying subscales was complete. When one or more items were excluded from both underlying scales, this agreement ranged from moderate to good, depending on the specific items left out in the partial versions.

Strengths and weaknesses

The main strength of the present study is the utilisation of data from multiple independent cohort studies which collectively comprised a very large analytical sample, so providing a high level of statistical precision. Despite slight variation in the exact wording of the questionnaire items between these studies, our findings were remarkably consistent across the six cohort studies drawn from Belgium, France, Finland and Sweden. This supports the generalisability of the present analysis across different settings in four European countries. However, in the studies included in the IPD-Work Consortium, no standardised procedure had been followed to translate the original job strain questionnaire across all studies. Thus, cross-cultural adaptation of the job strain instrument remains a source of error for an individual-participant data meta-analysis project on job strain. Based on a review of job demand and control questions used in 17 European cohort studies, our tests were limited to a total of 10 different partial scales that were available in these studies. It is therefore unclear how other scale modifications, including those with additional study-specific questions, agree with the complete scales. These differences, as well as those related to translation and cultural meaning of the wording, may affect the assessment of demands and control in international comparative studies [40].

Comparison with previous studies

Previously, a comparison of JCQ and DCQ-like questionnaires has been conducted in 682 participants in the JACE study [20]. The investigators in that study found a moderate agreement between median-based job strain classification using a 14-item JCQ (five demand items, nine control items) compared with the 11-item DCQ (five demand items, six control items). Attempts were made to increase the comparability between the scales by developing comparability-facilitating algorithms, as well as using regression models, in order to convert the DCQ scores to the same scale as the JCQ scores. However, with regard to the median-based job strain classification, these transformations did not meaningfully improve the agreement between the JCQ and DCQ [20]. We chose a different approach to harmonise the JCQ and DCQ scales, and used the five demand and six control items that are comparable between the JCQ and DCQ as the "complete" scales. These provided us with a reference measurement against which to examine the validity of partial versions available in the existing cohort studies.

Implications

In epidemiological studies, researchers often have to make a trade-off between the amount of data they would like to collect and the amount of data it is possible to collect without increasing the non-response due to overly burdensome inventories. In these circumstances, it is not uncommon for the original scales to be abbreviated. Research using job strain as the exposure measure has not always been based on validated standardised protocols, and modifications in the questionnaire may have contributed to inconsistencies in the observed job strain-disease association across various studies. For example, in a recent meta-analysis of ten prospective cohort studies, the authors reported a pooled relative risk of 1.4 (95% confidence interval 1.1-1.8) for incident coronary heart disease among participants with job strain compared to those without [41]. However, there was significant heterogeneity among datasets, with effect estimates close to the null value, or even opposite to the expected direction, in some studies [41]. Even though our study showed reasonable agreement for all the investigated partial scales with the complete scale, the agreement as well as the sensitivity decreased when several items were missing in the two subscales of the job strain measure. Lower sensitivity implies increased risk of misclassification of job strain when using abbreviated scales, which may attenuate or inflate a true relationship between job strain and an outcome of interest. When using abbreviated scales, the item content is important to consider. The control scale in the job strain index is composed of two dimensions (skill discretion and decision authority), and both these dimensions should be covered in a partial scale to measure the same construct as the complete JCQ/DCQ scale. This was the case in all the 17 studies comprising IPD-Work.

Conclusions

Information on the agreement between alternative operationalisations of job strain may help with the interpretation of previous findings, and harmonisation of multi-cohort data for pooled analyses. Our study provides information on the agreement between complete and partial job strain scales based on existing data from several European cohort studies. A high agreement for partial scales with at least half of the items of the complete scales, and an accurate classification of job strain when at least one of the scales has no missing items, suggest that these abbreviated scales assess the same underlying concepts as the complete survey instrument. However, all the partial scales in the present study (including the subscales comprising only two items) showed high to reasonable agreement with the complete scales. In order to capture the theoretical background for job strain and to measure the same construct as the complete JCQ/DCQ, it is important that the abbreviated control scales cover both the skill discretion and the decision authority dimensions.