The results of the present study call into question the practical value of using information in medical sickness certificates to predict the length of sick leave, as is the current practice in Norwegian NIOs. The sick-listed themselves predicted the length of their sick leave far more accurately, but this information is not routinely sought.
Representativeness
The officers in the present study were selected from experienced officers who had shown an interest in the field of sick leave. This selection might lead to an overestimate of officers' general ability to predict the length of sick leave. The performances of the two medical consultants were representative of eight of their colleagues who participated in the reproducibility part of the study. We therefore consider the diagnostic accuracy of the assessors to be representative of their professional groups, or at least not underestimated due to bias. Although the diagnostic accuracy varied within each group, the main conclusion of better predictive ability among the sick-listed was challenged neither by comparison with the mean length predicted by assessors, nor by comparison with the best-performing NIO assessor.
The distributions of gender and diagnosis among the 993 persons included in the study were comparable with those in the National Sickness Benefits Register. The findings of longer sick leaves in women with musculoskeletal disorders, and longer sick leaves in men with mental disorders, are consistent with the Register and other studies [11–13].
The low response rate among the sick-listed introduced a possible selection bias, although we could not identify any selection bias in gender, age, diagnosis or occupation [14]. If there had been a selection towards more predictable sick leaves, this should have been reflected in the assessments of officers and medical consultants. The general trend of lower diagnostic accuracy of NIO assessors in the responder group indicates that, if any selection bias contributes to the results, it leads to an underestimate of the self-predictive ability.
Why did the sick-listed make better predictions?
If the lengths of sick leaves were predominantly related to loss of function caused by sickness, in line with the legislation, we would expect the medical consultants' professional competence to favour them in predicting the lengths of sick leaves. The differences we observed between medical consultants and officers in mean ROC area were, however, minor. Furthermore, we could not demonstrate any significant differences in diagnostic accuracy between medical consultants and officers when aggregate information on disease, treatment, function related to work, and prognosis was available in the SC2. The improvement in ROC area with this aggregated information was minor, with the area just reaching 70%, which is considered borderline useful for some purposes [7]. The result is in line with Bjørndal's findings of low prognostic impact of the SC2 [15], and is supported by findings of a low predictive power of symptoms and signs in neck and shoulder disorders [16]. The better prediction of the length of sick leave by the sick-listed themselves is supported by studies that have identified different non-disease determinants of sick leave, such as job satisfaction [17], attitudes towards pain [18], irreplaceability [19] and psychosocial work environment [20–22]. Studies identifying that at least the initial sickness certification is predominantly patient controlled [23, 24] indicate the competence of the sick-listed. Self-rated health seems to be an independent predictor of return to work [17], disability pension [25] and early retirement [26]. Our findings can be interpreted as indicating that the subjective perception of sickness and work ability is more predictive of the length of sick leave than the apparently more objective description in medical terms.
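For readers less familiar with the ROC area referred to above, the following is a minimal sketch of how such an area can be computed from predicted lengths, using its rank-based interpretation: the probability that a randomly chosen long sick leave received a longer prediction than a randomly chosen short one, with ties counted as one half. The function name `roc_area` and the predicted values are hypothetical illustrations and are not data or code from the study.

```python
def roc_area(long_leave_predictions, short_leave_predictions):
    """Rank-based ROC area: share of (long, short) pairs in which the long
    leave was given the longer predicted length; ties count as half a pair."""
    if not long_leave_predictions or not short_leave_predictions:
        raise ValueError("both groups must contain at least one prediction")
    wins = 0.0
    for predicted_long in long_leave_predictions:
        for predicted_short in short_leave_predictions:
            if predicted_long > predicted_short:
                wins += 1.0
            elif predicted_long == predicted_short:
                wins += 0.5
    return wins / (len(long_leave_predictions) * len(short_leave_predictions))


# Hypothetical predicted lengths in weeks, grouped by the observed outcome
# (leave longer than the cut-off versus not). Invented for illustration only.
predicted_for_long_leaves = [20, 14, 8]
predicted_for_short_leaves = [4, 10, 14, 6]

area = roc_area(predicted_for_long_leaves, predicted_for_short_leaves)
print(f"ROC area: {area:.2f}")
```

An area of 0.5 corresponds to predictions no better than chance and 1.0 to perfect separation, which is why an area of about 70% is described as only borderline useful.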
The differences in predictive ability were especially significant in persons with mental and neck disorders, while the NIO assessors performed as well as the sick-listed in the more clear-cut injuries with more standardised treatment and prognosis. Mental disorders, which are highly prevalent in the population and an increasing cause of absence [27], are of special interest [13]. This increasing prevalence of sick leaves indicates the presence of factors separate from the diagnostic criteria. It seems that the more clear-cut the disease and the recommended treatment, the smaller the gain in predictive ability achieved by asking the sick-listed, and vice versa. The modest gain in predictive ability obtained by adding more medical information through the SC2 supports this interpretation. A more complete description of symptoms and treatment does not necessarily give better prognostic information when it includes little knowledge of the consequences related to occupation, and when the effects of treatment are undocumented or, at best, marginal.
Diagnostic accuracy – practical implication
The Norwegian NIO is obliged by legislation to perform early intervention on the sick-listed in an effort to reduce the length of sick leave and the risk of expulsion from work. Limited resources and the large number of sick-listed individuals make selection desirable before any intervention is initiated. An alternative to selection on the basis of medical certificates is to communicate directly with the sick-listed themselves. This selection for intervention by NIOs might be seen as screening. The aim is to reach – at an acceptable cost – as many as possible of those who might benefit from intervention. The potential individual gain from intervention will be greater when longer-lasting sick leaves can be anticipated, and greater the sooner individual intervention programmes are established.
The marginal predictive ability of NIO assessors and the modest agreement between them call into question the use of resources on selection based on information from medical certificates. The predictions of medical consultants tend to be better than those of officers, but not to an extent that makes it more meaningful to use medical consultants rather than officers in the selection process.
With limited resources for intervention, it might be more cost-effective to identify those whose sick listing will last longer than 26 weeks instead of 12 weeks. Based on self-reporting, eight out of ten of those selected would be true positives, and one quarter of the individuals would be reached. If the selection were instead based on medical certificates, reaching the same number of true positives at 14 days of sick leave would reverse the proportion of true positives, from eight out of ten to two or three out of ten.
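The practical consequence of these proportions can be illustrated with a small calculation. The sketch below only uses the proportions of true positives quoted above (eight out of ten, and two or three out of ten); the target of 20 true positives and the function name `selected_needed` are hypothetical and chosen purely for illustration.

```python
from fractions import Fraction
from math import ceil


def selected_needed(target_true_positives, true_positive_proportion):
    """Number of persons who must be selected for intervention in order to
    include the target number of true positives, given the proportion of
    true positives among those selected."""
    if not 0 < true_positive_proportion <= 1:
        raise ValueError("proportion must be in the interval (0, 1]")
    return ceil(Fraction(target_true_positives) / true_positive_proportion)


# Hypothetical target: 20 persons whose sick leave really will be long.
target = 20

# Proportions of true positives as quoted in the text.
self_report = Fraction(8, 10)
certificate_high = Fraction(3, 10)
certificate_low = Fraction(2, 10)

print("Selected via self-report:", selected_needed(target, self_report))
print("Selected via certificates:",
      selected_needed(target, certificate_high), "to",
      selected_needed(target, certificate_low))
```

Under these assumptions, selection based on medical certificates would require roughly three to four times as many persons to be selected for intervention in order to reach the same number of long-term sick-listed.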
In the search for tests predicting long-lasting sick leaves, such as the Örebro Musculoskeletal Pain Questionnaire [28], the present study indicates that the results of any such test should be compared with a crude self-estimate of the length of sick leave.