Abstract
As of 2020, 807,920 individuals in the U.S. had end-stage kidney disease (ESKD), with about 70% of patients on dialysis, a life-sustaining treatment. Dialysis patients experience high mortality rates, and frequent hospitalizations are a major contributor to morbidity and mortality. There is growing interest in identifying the risk factors for the correlated outcomes of hospitalization and mortality among dialysis patients across the U.S. Utilizing national data from the United States Renal Data System (USRDS), we propose a novel multivariate varying coefficient spatiotemporal model to study the time-dynamic effects of risk factors (e.g., urbanicity and area deprivation index) on the multivariate outcome of hospitalization and mortality rates, as a function of time on dialysis. While capturing time-varying effects of risk factors on the mean, the proposed model also incorporates spatiotemporal patterns of the residuals for efficient inference. Estimation is based on the fusion of functional principal component analysis and Markov chain Monte Carlo techniques, following basis expansions of the varying coefficient functions and a multivariate Karhunen–Loève expansion of region-specific random deviations. The finite-sample performance of the proposed method is studied through extensive simulations. Novel applications to the USRDS data highlight significant risk factors of hospitalizations and mortality and characterize time periods on dialysis and spatial locations across the U.S. with elevated hospitalization and mortality risks.
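For orientation, the two expansions named in the abstract can be written in a generic form. This is only a sketch under assumed notation, not the article's exact specification: the indices i (region), j (outcome), p (risk factor), the basis functions B_k, and the truncation levels K and L are all introduced here for illustration, and the covariates are taken as time-invariant for simplicity.

```latex
% Assumed notation: i indexes regions, j = 1, 2 indexes the outcomes
% (hospitalization and mortality rates), t is time on dialysis,
% X_{i0} = 1 gives an intercept function.
\begin{align*}
Y_i^{(j)}(t) &= \sum_{p=0}^{P} \beta_p^{(j)}(t)\, X_{ip} + U_i^{(j)}(t) + \epsilon_i^{(j)}(t), \qquad j = 1, 2, \\
% Basis expansion of each varying coefficient function
\beta_p^{(j)}(t) &\approx \sum_{k=1}^{K} b_{pk}^{(j)}\, B_k(t), \\
% Truncated multivariate Karhunen–Loève expansion of the region-specific deviations
\bigl(U_i^{(1)}(t),\, U_i^{(2)}(t)\bigr)^{\top} &\approx \sum_{\ell=1}^{L} \xi_{i\ell}\, \bigl(\psi_\ell^{(1)}(t),\, \psi_\ell^{(2)}(t)\bigr)^{\top}.
\end{align*}
```

In a formulation of this kind, the scores \xi_{i\ell} are shared across the two outcomes, which is what lets the expansion capture their correlation. Spatial dependence would then enter through the joint distribution of the scores across regions.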
Data Availability Statement
The release of the data used in this paper is governed by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) through the USRDS Coordinating Center. The data can be requested from the USRDS through a data use agreement.
Acknowledgements
This study was supported by a grant from the National Institute of Diabetes and Digestive and Kidney Diseases (R01 DK092232 to DS, DVN, EK, SB, CMR, QQ, and YL). The data reported here have been supplied by the United States Renal Data System (USRDS). The interpretation and reporting of these data are the responsibility of the author(s) and in no way should be seen as an official policy or interpretation of the U.S. government.
Supplementary Information
The supplementary material for this article, including referenced appendices, is available online. The R code and documentation for implementing the proposed MV-VCSTM on simulated datasets are provided on GitHub at https://github.com/dsenturk/MV-VCSTM.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Qian, Q., Nguyen, D.V., Kürüm, E. et al. Multivariate Varying Coefficient Spatiotemporal Model. Stat Biosci (2024). https://doi.org/10.1007/s12561-024-09419-8