The mean observed continuous sickness absence was 100.8 days (median 48 days). Sick leaves lasted a mean of 105.1 days in females, compared to 94.6 days in males (medians 55 and 43 days, respectively). The mean length among persons with musculoskeletal disorders was 90.2 days in 335 males and 108.6 days in 489 females. The mean length among persons with mental disorders was 120.6 days in 56 males and 90.0 days in 113 females.
The mean length of the sick leave in the responder group was 107.4 days (95% confidence interval, CI, 88.7–126.1 days), compared to 92.4 days in the 343 non-responders. Stratified analysis revealed longer mean sick leaves among responders aged 40 years and younger (109.3 days, 95% CI 81.4–134.5 days) than among the corresponding non-responders (79.3 days, 95% CI 65.6–93.1 days). Stratification by gender or by musculoskeletal or mental disorders did not reveal any significant differences in the length of sick leave between responders and non-responders.
All assessors, including the sick-listed themselves, systematically overestimated the length of short sick leaves (lasting 4–11 weeks) and underestimated the length of long sick leaves (exceeding 16 weeks; Table 1). The proportions of sick leaves lasting longer than 8, 12 or 26 weeks did not differ significantly between the responder group and the rest.
Receiver operating characteristics of prediction
The sick-listed subjects predicted sick leaves equal to or longer than 12 weeks more accurately than the NIO medical consultants and officers, as shown by the ROC curve in Figure 2. The differences in ROC area between responders and non-responders were most marked among younger subjects and in females (Table 2). Generally, the length of sick leave was predicted more accurately in older subjects than in younger subjects, and better in males than in females. Access to past history of sick leaves improved the ROC area of NIO consultants from 60.6% (95% CI 51.3–69.9%) to 75.4% (95% CI 68.2–82.6%) in male sick-listed, but did not improve the ROC area in assessments of female sick-listed.
Changing the observed length to be identified from 12 weeks to 8 or 26 weeks did not significantly change the diagnostic accuracy as assessed by the ROC area. The sick-listed identified sick leaves lasting 8 weeks or longer with a ROC area of 79.5% (95% CI 72.2–85.6%), and sick leaves lasting 26 weeks or longer with a ROC area of 75.5% (95% CI 67.9–82.1%). Sick-listed persons with mental disorders or with neck, or shoulder and arm disorders, were most accurate in their assessment (Figure 3). This was in contrast to NIO assessors, who demonstrated the lowest predictive ability in these diagnostic groups, particularly in responders. The impact on diagnostic accuracy of knowing the occupation was small.
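The ROC areas reported above can be read as the probability that a randomly chosen long sick leave received a longer predicted length than a randomly chosen short one, with ties counted as one half. The following is a minimal sketch of that calculation, not the software used in the study; the function name `roc_area` and the example predictions are hypothetical.

```python
from typing import Sequence


def roc_area(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """ROC area as P(pos > neg) + 0.5 * P(pos == neg) over all pos/neg pairs.

    pos_scores: predicted lengths for sick leaves that turned out long.
    neg_scores: predicted lengths for sick leaves that turned out short.
    """
    if len(pos_scores) == 0 or len(neg_scores) == 0:
        raise ValueError("both groups must contain at least one prediction")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # long leave ranked above short leave
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted lengths in weeks (illustration only, not study data).
predicted_for_long_leaves = [8, 12]   # observed length >= 12 weeks
predicted_for_short_leaves = [4, 8]   # observed length < 12 weeks
example_area = roc_area(predicted_for_long_leaves, predicted_for_short_leaves)
```

With these made-up values, three of the four pairs are ranked correctly and one is tied, so `example_area` is 3.5 / 4 = 0.875. An area of 0.5 corresponds to chance-level prediction.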
Sensitivity, specificity, predictive value and likelihood ratio
The sick-listed subjects predicted their sick leaves with higher sensitivity and PPV than the NIO assessors (Tables 3, 4). Male sick-listed predicted sick leaves lasting at least 12 weeks with a sensitivity of 0.82 (95% CI 0.60–0.95) and a PPV of 0.78 (95% CI 0.56–0.93) using a predicted length of at least 8 weeks. The corresponding sensitivity and PPV of female sick-listed were both 0.61 (95% CI 0.44–0.77).
A predicted length of at least 8 weeks was the preferable cut-off for identifying sick leaves lasting at least 12 weeks (Table 3). Raising the cut-off to a predicted length of at least 12 weeks reduced the sensitivity in all the data to 0.17 in medical consultants and 0.25 in officers. The corresponding improvement in PPV was modest, reaching 0.54 in medical consultants and 0.45 in officers. Using a predicted length of at least 4 weeks would have markedly reduced the specificity (Figure 2).
The sensitivity of identifying sick leaves lasting at least 26 weeks was generally low when medical consultants and officers predicted on the basis of SC1s (Table 4). Introducing SC2 information improved the sensitivity somewhat, but the effects on the likelihood ratio and on the prevalence-corrected PPV were minor.
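For readers who want to trace how the measures in this section relate to each other, the sketch below derives sensitivity, specificity, PPV and the positive likelihood ratio from a 2×2 table, and recomputes the PPV for a chosen prevalence using Bayes' theorem. It is an illustration under stated assumptions: the class `TwoByTwo`, the function `ppv_at_prevalence` and the counts are hypothetical and are not taken from Tables 3 or 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of predicted-long vs. observed-long sick leaves."""
    tp: int  # predicted long, observed long
    fp: int  # predicted long, observed short
    fn: int  # predicted short, observed long
    tn: int  # predicted short, observed short

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def lr_positive(self) -> float:
        # Positive likelihood ratio: sensitivity / (1 - specificity).
        return self.sensitivity / (1.0 - self.specificity)


def ppv_at_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV recomputed for a given prevalence (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for illustration only (not study data).
table = TwoByTwo(tp=40, fp=20, fn=10, tn=130)
corrected_ppv = ppv_at_prevalence(table.sensitivity, table.specificity, prevalence=0.333)
```

When the prevalence passed to `ppv_at_prevalence` equals the prevalence in the table itself (here 50 of 200), the result coincides with the directly computed `table.ppv`; the likelihood ratio, by contrast, does not depend on prevalence.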
The effects of the different predictive strategies can be illustrated by considering a program designed to intervene, at 14 days of sick leave, in all cases where the subject is expected to be sick-listed for more than 12 weeks. Out of every 1000 sick-listed persons, 333 will be sick-listed for more than 12 weeks according to the prevalence in this study. A random selection of 333 persons will include 111 true positives, whereas 333 persons selected by officers will include 133 of the 333 persons who will be sick-listed for at least 12 weeks. Having officers evaluate 1000 sick-listed individuals thus increases the number of true positives by 22 in a selection of 333 sick-listed persons. The alternative strategy of asking the sick-listed themselves will yield 210 true positives in a selection of 333 persons.
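The arithmetic in this illustration can be retraced in a few lines. The sketch below uses only the figures quoted in the paragraph above (1000 persons, 333 long sick leaves, 333 selected, 133 and 210 true positives); the variable and function names are hypothetical, and the implied PPVs are simply the quoted true positives divided by the number selected.

```python
POPULATION = 1000   # sick-listed persons considered
LONG_CASES = 333    # of these, sick-listed for more than 12 weeks
N_SELECTED = 333    # persons selected for intervention

prevalence = LONG_CASES / POPULATION


def expected_random_true_positives(n_selected: int, prevalence: float) -> int:
    """Expected true positives when persons are selected at random."""
    return round(n_selected * prevalence)


random_tp = expected_random_true_positives(N_SELECTED, prevalence)  # 333 * 0.333
officer_tp = 133   # true positives among 333 selected by officers (from the text)
self_tp = 210      # true positives among 333 selected by self-prediction (from the text)

gain_officers = officer_tp - random_tp   # extra true positives vs. random selection
gain_self = self_tp - random_tp

# PPV implied by each strategy within the selected group.
implied_ppv_officers = officer_tp / N_SELECTED
implied_ppv_self = self_tp / N_SELECTED
```

Under these figures the random baseline is 111 true positives, the officers add 22 and self-prediction adds 99, corresponding to implied PPVs of about 0.40 and 0.63.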
Reliability and reproducibility of the predicted length
Agreement between medical consultants in their initial prediction of sick leaves lasting at least 12 weeks was fair, with a kappa of 0.31 (95% CI 0.20–0.43). The corresponding kappa value between officers was 0.05 (95% CI −0.05 to 0.14).
In the prediction of sick leaves lasting at least 12 weeks based on the SC2, agreement was moderate between medical consultants (kappa = 0.42, 95% CI 0.29–0.54) and fair between officers (kappa = 0.26, 95% CI 0.10–0.42). The corresponding agreements in the prediction of sick leaves lasting at least 26 weeks were moderate between medical consultants (kappa = 0.55, 95% CI 0.40–0.70) and fair between insurance officers (kappa = 0.31, 95% CI 0.17–0.47).
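As a reminder of how the kappa statistic corrects raw agreement for chance, the sketch below computes Cohen's kappa for two assessors making a binary prediction (1 = sick leave predicted to last at least the threshold, 0 = shorter). It is a minimal illustration with hypothetical ratings, not the procedure or data of the study. The verbal labels follow the Landis and Koch bands, which is an assumption here, although the terms "fair" and "moderate" as used above are consistent with them.

```python
from typing import Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two raters giving binary (0/1) ratings on the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must rate the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_a = sum(rater_a) / n   # proportion rated 1 by rater A
    p_b = sum(rater_b) / n   # proportion rated 1 by rater B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)   # chance agreement
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa: float) -> str:
    """Verbal label for a kappa value (Landis and Koch bands, assumed here)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical predictions by two assessors on ten cases (not study data).
assessor_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
assessor_2 = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
example_kappa = cohen_kappa(assessor_1, assessor_2)
```

In this made-up example the assessors agree on 8 of 10 cases, but because chance agreement is already 0.58, kappa is only about 0.52.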
The differences in diagnostic accuracy between the two participating medical consultants and their eight colleagues in the reproducibility group were not significant.