
Semiparametric predictive inference for failure data using first-hitting-time threshold regression

Lifetime Data Analysis

Abstract

The progression of disease in an individual can be described mathematically as a stochastic process. The individual experiences a failure event when the disease path first reaches or crosses a critical disease level. This crossing defines both the failure event and a first hitting time, or time-to-event, and both are important in medical contexts. When the context involves explanatory variables, there is usually an interest in incorporating regression structures into the analysis, and the methodology known as threshold regression comes into play. To date, most applications of threshold regression have been based on parametric families of stochastic processes. This paper presents a semiparametric form of threshold regression that requires the stochastic process to have only one key property, namely, stationary independent increments. As this property is frequently encountered in real applications, this model has potential for use in many fields. The mathematical underpinnings of this semiparametric approach for estimation and prediction are described. The basic data element required by the model is a pair of readings representing the observed change in time and the observed change in disease level, arising from either a failure event or survival of the individual to the end of the data record. An extension is presented for applications where the underlying disease process is unobservable but component covariate processes are available to construct a surrogate disease process. Threshold regression, used in combination with a data technique called Markov decomposition, allows the methods to handle longitudinal time-to-event data by uncoupling a longitudinal record into a sequence of single records. Computational aspects of the methods are straightforward. An array of simulation experiments that verify computational feasibility and statistical inference is reported in an online supplement. Case applications based on longitudinal observational data from The Osteoarthritis Initiative (OAI) study are presented to demonstrate the methodology and its practical use.
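
The basic data element described in the abstract, a pair (change in time, change in disease level) ending in either failure or survival, can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' estimation method: the function names `simulate_record` and `decompose` are hypothetical, the increments are taken to be Gaussian purely for convenience (the paper requires only stationary independent increments), and `decompose` shows one simple reading of how a longitudinal record might be uncoupled into single records.

```python
import random


def simulate_record(threshold, drift, sigma, dt, max_time, rng):
    """Simulate one disease path starting at level 0.

    Assumption for illustration: increments over each step of length dt
    are independent Gaussian draws with mean drift*dt and standard
    deviation sigma*sqrt(dt), so they are stationary and independent.
    Returns (elapsed_time, level_change, failed).
    """
    level = 0.0
    steps = int(round(max_time / dt))
    for k in range(1, steps + 1):
        level += rng.gauss(drift * dt, sigma * dt ** 0.5)
        if level >= threshold:
            # First step at which the path reaches or crosses the threshold.
            return k * dt, level, True
    # Path never reached the threshold: the subject survives the record.
    return steps * dt, level, False


def decompose(visits, failed_at_end):
    """Split a longitudinal record into single (dt, dy, failed) records.

    visits is a list of (time, level) readings in time order. Only the
    final interval can end in failure; earlier intervals are survivals.
    This is a hypothetical, simplified view of uncoupling a record.
    """
    records = []
    for j in range(1, len(visits)):
        (t0, y0), (t1, y1) = visits[j - 1], visits[j]
        is_last = j == len(visits) - 1
        records.append((t1 - t0, y1 - y0, failed_at_end and is_last))
    return records


if __name__ == "__main__":
    rng = random.Random(2023)
    for _ in range(3):
        print(simulate_record(threshold=5.0, drift=0.8, sigma=1.0,
                              dt=0.1, max_time=10.0, rng=rng))
    print(decompose([(0.0, 1.0), (1.0, 1.5), (2.5, 3.0)], failed_at_end=True))
```

Each simulated record is exactly one such data pair plus a failure indicator, and `decompose` turns a multi-visit record into a sequence of them. Because time is discretized here, the recorded hitting time is the first grid point at or beyond the threshold, which slightly overstates the true first hitting time of a continuous path.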



Acknowledgements

The authors thank the LIDA Associate Editor and referees for their thorough reports. Their comments and suggestions were extremely helpful in uncovering errors in the initial manuscript, encouraging us to strengthen theoretical and practical arguments, and generally improving the presentation of our research work. The authors also thank the Osteoarthritis Initiative (OAI) for granting access to their data. The OAI is a public-private partnership comprised of contracts N01-AR-2-2258, N01-AR-2-2259, N01-AR-2-2260, N01-AR-2-2261, and N01-AR-2-2262 funded by the National Institutes of Health (NIH), a branch of the Department of Health and Human Services. Private funding partners for OAI include Merck Research Laboratories, Novartis Pharmaceuticals Corporation, GlaxoSmithKline, and Pfizer, Inc. Private-sector funding for the OAI is managed by the Foundation for the NIH. The contents of this article do not necessarily reflect the opinions or views of the OAI Study Investigators, the NIH, or the private funding partners of the OAI. The authors of this article are not part of the OAI investigative team. Research of M-LT Lee was partially supported by NIH grant R01EY022445.

Author information

Corresponding author

Correspondence to G. A. Whitmore.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 2202 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lee, ML.T., Whitmore, G.A. Semiparametric predictive inference for failure data using first-hitting-time threshold regression. Lifetime Data Anal 29, 508–536 (2023). https://doi.org/10.1007/s10985-022-09583-3


  • DOI: https://doi.org/10.1007/s10985-022-09583-3
