Including Injured Workers Without Compensated Time Loss in Cox Regression Models: Analyzing Time Loss Using All Available Data

Journal of Occupational Rehabilitation

Abstract

Introduction: Cox proportional hazards regression is commonly used to analyze time loss duration, but statistical packages conventionally exclude cases with no recorded follow-up time. For this and other substantive reasons, many researchers limit time loss analyses to the subset of workers who received time loss compensation. This can exclude both injured workers who missed no work days and those missing up to a week of work. For some research questions, excluding cases where injury is reported but no time loss is recorded may result in significant ascertainment bias. We present a novel technique based on standard survival analysis methods to allow for the inclusion of all cases when appropriate.

Methods: A simple technique to allow standard statistical software to include both medical-only and time loss claims in Cox regression is illustrated by example and compared with a two-part model using a time-varying step function to allow regression effects to change over time.

Results: We showed that a pooled analysis is obtained by simply adding a small constant to the time loss duration variable. This technique produced appropriate estimates while accounting for censoring when a suitable method was used for tied event times. Using a formal statistical framework, the combined model was justified as a special case of the more standard two-part model approach.

Conclusions: When it is desirable to have a single pooled outcome estimate for injured workers with both medical-only and time loss claims, all claims can be combined into one statistical model. This may have particular utility for research questions where the risk factor or intervention of interest would be expected to affect time loss duration beginning upstream of claim filing or statutory compensation waiting periods. This novel alternative modeling strategy expands the tool kit available for analyzing time loss data.
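As a minimal sketch of the shift described in the Results, the Python snippet below adds a small constant to a set of hypothetical time loss durations. The data, the constant 0.5, the function names, and the `t > 0` filter (which stands in for software that drops cases with no follow-up time) are all illustrative assumptions, not the authors' code. The shift preserves the ordering of the durations, so the medical-only claims simply become the earliest, heavily tied, event time.

```python
from collections import Counter

# Hypothetical time loss durations in days; 0 marks a medical-only claim.
durations = [0, 0, 0, 3, 3, 10, 25, 0, 7, 60]

EPSILON = 0.5  # assumed small constant; any positive value keeps the same ordering


def usable_cases(times):
    """Mimic software that drops cases with no positive follow-up time."""
    return [t for t in times if t > 0]


def shift(times, eps=EPSILON):
    """Add a small constant so that zero-duration claims are retained."""
    return [t + eps for t in times]


def risk_set_sizes(times):
    """Number of cases still at risk at each distinct event time."""
    return {t: sum(1 for u in times if u >= t) for t in sorted(set(times))}


dropped = usable_cases(durations)        # zero-duration claims are lost
kept = usable_cases(shift(durations))    # every claim is retained

print(len(dropped), len(kept))
print(risk_set_sizes(kept))
print(Counter(kept).most_common(1))      # the largest tie sits at the shifted zero
```

In this toy data set four of the ten claims share the earliest event time after shifting, which illustrates why the choice of method for tied event times matters for the pooled analysis.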



Acknowledgments

The authors thank Gary M. Franklin (Medical Director, Washington State Department of Labor and Industries) for generously sharing his professional expertise and for authorizing access to this data and Sheilah Hogg-Johnson (Senior Biostatistician, Institute of Work and Health) for graciously providing feedback on this manuscript.

Author information

Corresponding author

Correspondence to Jeanne M. Sears.

Appendix: Statistical Models

In this appendix, we briefly review the statistical details of the models under discussion. The goal is to provide formal justification for our novel approach and to show explicitly the relationship between the two-part model approach and the combined Cox regression approach. Finally, we show how a two-part model can be represented as a special case of the combined model when certain interactions are included.

In the discussion that follows we use T = t to denote that an event occurs at time “t”. We do not represent the outcome as censored times in the presentation of the underlying regression models but rather discuss censoring as an issue that would be addressed in the estimation of these models. Finally, for ease of presentation we focus on discrete event times such that the probability that T = t is meaningful. The two-part model consists of a first-part logistic regression model and a second-part Cox proportional hazards regression model and can be represented as follows:

$$ {\text{Part 1}}\left( {{\text{odds ratio}}} \right){: \log }\left[ {{\text{P}}\left( {{\text{T}} = 0|{\text{X}}} \right)/\left\{ { 1- {\text{P}}\left( {{\text{T}} = 0|{\text{X}}} \right)} \right\}} \right] = \beta _{0} + \beta _{1}* {\text{X}} $$
(1)
$$ {\text{Part 2}}\left( {{\text{hazard ratio}}} \right){: \text{log hazard}}\left( {{\text{t}}|{\text{X}}} \right) = { \log }\left[ {\lambda _{0} \left( {\text{t}} \right)} \right] + \beta _{1}*{\text{X}} $$
(2)
$$ { \log }\left[ {{\text{P}}\left( {{\text{T}} = {\text{t}}|{\text{X}}} \right)/{\text{P}}\left( {{\text{T}} \ge {\text{t}}|{\text{X}}} \right)} \right] = { \log }\left[ {\lambda _{0} \left( {\text{t}} \right)} \right] + \beta _{ 1}*{\text{X}} = \beta _{0} \left( {\text{t}} \right) + \beta _{ 1} *{\text{X}} $$

The first part can be alternatively represented using a relative risk regression model rather than a logistic regression model:

$$ {\text{Part 1}}\left( {{\text{relative risk}}} \right){: \log }\left[ {{\text{P}}\left( {{\text{T}} = 0|{\text{X}}} \right)} \right] = \beta _{0} + \beta _{ 1} *{\text{X}} $$
(3)

Furthermore, the relative risk regression is equivalent to a hazard regression model when focusing on the first event time:

$$ {\text{Part 1}}\left( {{\text{hazard ratio}}} \right){: \log }\left[ {{\text{P}}\left( {{\text{T}} = 0|{\text{X}}} \right)} \right] = { \log }\left[ {{\text{P}}\left( {{\text{T}} = 0|{\text{X}}} \right)/{\text{P}}\left( {{\text{T}} \ge 0|{\text{X}}} \right)} \right] $$
(4)
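Equation 4 can be checked numerically on a toy sample. The sketch below, which assumes uncensored discrete event times and uses hypothetical data and function names, computes the empirical discrete hazard P(T = t)/P(T ≥ t) and the empirical probability P(T = t). At t = 0 every case is still at risk, so the two quantities coincide; at later times they differ.

```python
def discrete_hazard(times, t):
    """Empirical P(T = t) / P(T >= t) for uncensored discrete event times."""
    at_risk = sum(1 for u in times if u >= t)
    events = sum(1 for u in times if u == t)
    return events / at_risk


def probability(times, t):
    """Empirical P(T = t)."""
    return sum(1 for u in times if u == t) / len(times)


# Hypothetical time loss durations in days; 0 marks no time loss.
times = [0, 0, 0, 0, 3, 3, 7, 10, 25, 60]

h0, p0 = discrete_hazard(times, 0), probability(times, 0)
h3, p3 = discrete_hazard(times, 3), probability(times, 3)

print(h0, p0)  # equal: all 10 cases are at risk at t = 0
print(h3, p3)  # differ: only 6 cases remain at risk at t = 3
```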

The equivalence is due to the fact that P(T ≥ 0) = 1 by definition (of time loss or any event time). Now we can re-write and subsequently combine these two models, labeling each coefficient to show from which part of the model it originates and how it is represented in the combined model:

$$ {\text{Part 1}}\left( {{\text{hazard ratio}}} \right){: \log }\left[ {{\text{P}}\left( {{\text{T}} = 0|{\text{X}}} \right)/{\text{P}}\left( {{\text{T}} \ge 0|{\text{X}}} \right)} \right] = \beta _{0} \left[ 1\right] + \beta _{ 1} \left[ 1\right]*{\text{X}} $$
(5)
$$ {\text{Part 2}}\left( {{\text{hazard ratio}}} \right){: \log }\left[ {{\text{P}}\left( {{\text{T}} = {\text{t}}|{\text{X}}} \right)/{\text{P}}\left( {{\text{T}} \ge {\text{t}}|{\text{X}}} \right)} \right] = \beta _{0} \left[ 2\right]\left( {\text{t}} \right) + \beta _{ 1} \left[ 2\right]*{\text{X}} $$
(6)
$$ {\text{Combined}}{: \log }\left[ {{\text{P}}\left( {{\text{T}} = {\text{t}}|{\text{X}}} \right)/{\text{P}}\left( {{\text{T}} \ge {\text{t}}|{\text{X}}} \right)} \right] = \beta _{0} \left( {\text{t}} \right) + \beta _{ 1} \left[ 1\right]*{\text{X}} + \gamma *\left( {{\text{t}} > 0} \right)*{\text{X}} $$
(7)

where γ = β1[2] − β1[1].
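As a worked check, written in the notation of Eqs. 5–7, substituting t = 0 and t > 0 into the combined model recovers the two parts. The indicator (t > 0) equals 0 at the first event time and 1 afterwards, and the baseline term β0(t) is taken to equal β0[1] at t = 0 and β0[2](t) for t > 0:

$$ {\text{t}} = 0{: \log }\left[ {{\text{P}}\left( {{\text{T}} = 0|{\text{X}}} \right)/{\text{P}}\left( {{\text{T}} \ge 0|{\text{X}}} \right)} \right] = \beta _{0} \left( 0 \right) + \beta _{1} \left[ 1\right]*{\text{X}} $$
$$ {\text{t}} > 0{: \log }\left[ {{\text{P}}\left( {{\text{T}} = {\text{t}}|{\text{X}}} \right)/{\text{P}}\left( {{\text{T}} \ge {\text{t}}|{\text{X}}} \right)} \right] = \beta _{0} \left( {\text{t}} \right) + \left( {\beta _{1} \left[ 1\right] + \gamma } \right)*{\text{X}} = \beta _{0} \left( {\text{t}} \right) + \beta _{1} \left[ 2\right]*{\text{X}} $$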

Thus the covariate effect for “early” (t = 0) is β1[1] and the covariate effect for “late” (t > 0) is β1[1] + γ = β1[2]. This shows that fitting a combined model with appropriate interaction terms is equivalent to fitting a two-part model that uses relative risk regression for the first part. In addition, a test of γ = 0 assesses whether the part 1 and part 2 model coefficients are equal, and could therefore be used to justify fitting a combined model when they are not found to differ significantly. Finally, if appropriate, the effect modification by time (i.e., γ) can be removed and a single overall pooled estimate obtained, β1 = β1[1] = β1[2].
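One way such a test of γ = 0 might be carried out is a Wald test on the fitted interaction coefficient. The appendix does not specify the test, so the Wald form, the function name, and all numeric values in the sketch below are hypothetical and chosen only for illustration.

```python
import math


def wald_test(estimate, std_error):
    """Two-sided Wald test of a coefficient against zero (normal approximation)."""
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical fitted values from a combined model with a (t > 0) * X interaction.
beta1_part1 = 0.18   # log relative risk at t = 0 (beta1[1])
gamma_hat = 0.10     # interaction coefficient (beta1[2] - beta1[1])
gamma_se = 0.08      # its standard error

z, p_value = wald_test(gamma_hat, gamma_se)
early_ratio = math.exp(beta1_part1)              # covariate effect for "early"
late_ratio = math.exp(beta1_part1 + gamma_hat)   # covariate effect for "late"

print(f"z = {z:.2f}, p = {p_value:.3f}")
print(f"early ratio = {early_ratio:.3f}, late ratio = {late_ratio:.3f}")
```

With these made-up numbers the interaction is not statistically significant at the 0.05 level, which is the situation in which the text suggests a single pooled estimate could be justified.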

The antilog of the coefficient β1[1] in the part 1 model (Eq. 5) can be interpreted as the relative risk (or “hazard”) of having no time loss (versus some time loss), comparing those in the intervention group with those in the reference group. By contrast, the antilog of the coefficient β1 in the part 1 logistic model (Eq. 1) is an odds ratio: the odds of having no time loss for the intervention group relative to the odds for the reference group, which approximates the relative risk only in limited circumstances. The antilog of the coefficient β1[2] in the part 2 model (Eq. 6) can be interpreted as the relative risk (or “hazard”) of ending time loss, comparing those in the intervention group with those in the reference group, among those having some compensated time loss. The antilog of the coefficient β1 in the combined model (Eq. 7) can be interpreted as the relative risk (or “hazard”) of ending time loss among all injured workers, comparing those in the intervention group with those in the reference group, assuming that γ = 0 (that is, the relative risk of ending time loss is the same for those with no compensated time loss as for those with compensated time loss).
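To illustrate how far the odds ratio can sit from the relative risk, the sketch below compares the two measures for hypothetical counts of claims with no time loss. The counts and function names are assumptions for illustration only. The two measures are close when the outcome is rare and diverge when it is common.

```python
def relative_risk(events_a, n_a, events_b, n_b):
    """Risk in group A divided by risk in group B."""
    return (events_a / n_a) / (events_b / n_b)


def odds_ratio(events_a, n_a, events_b, n_b):
    """Odds in group A divided by odds in group B."""
    odds_a = events_a / (n_a - events_a)
    odds_b = events_b / (n_b - events_b)
    return odds_a / odds_b


# Hypothetical counts of claims with no time loss, out of 100 claims per group
# (intervention events, intervention total, reference events, reference total).
common = (60, 100, 50, 100)
rare = (6, 100, 5, 100)

for label, counts in (("common", common), ("rare", rare)):
    rr = relative_risk(*counts)
    orr = odds_ratio(*counts)
    print(f"{label}: RR = {rr:.3f}, OR = {orr:.3f}")
# prints "common: RR = 1.200, OR = 1.500" then "rare: RR = 1.200, OR = 1.213"
```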

About this article

Cite this article

Sears, J.M., Heagerty, P.J. Including Injured Workers Without Compensated Time Loss in Cox Regression Models: Analyzing Time Loss Using All Available Data. J Occup Rehabil 18, 225–232 (2008). https://doi.org/10.1007/s10926-008-9144-1

  • DOI: https://doi.org/10.1007/s10926-008-9144-1
