Stabilizing Sparse Cox Model Using Statistic and Semantic Structures in Electronic Medical Records
Stability in clinical prediction models is crucial for transferability between studies, yet it has received little attention. The problem is paramount in high-dimensional data, which invites sparse models with feature selection capability. We introduce an effective method to stabilize a sparse Cox model of time-to-event outcomes using statistical and semantic structures inherent in Electronic Medical Records (EMR). Model estimation is stabilized using three feature graphs built from (i) Jaccard similarity among features, (ii) an aggregation of the Jaccard similarity graph and a recently introduced semantic EMR graph, and (iii) Jaccard similarity among features transferred from a related cohort. Our experiments are conducted on two real-world hospital datasets: a heart failure cohort and a diabetes cohort. On two stability measures, the Consistency index and the signal-to-noise ratio (SNR), our proposed methods significantly increased feature stability compared with the baselines.
Keywords: Electronic Medical Record; Consistency Index; Transfer Learning; Jaccard Index; Semantic Structure
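The abstract names two building blocks that are easy to illustrate: a feature graph weighted by Jaccard similarity, and the Consistency index used to score feature stability. The following is a minimal sketch, not the authors' implementation. It assumes binary (presence/absence) EMR features and uses Kuncheva's formulation of the consistency index for two equal-size feature subsets; the function names `jaccard_feature_graph` and `consistency_index` are hypothetical.

```python
import numpy as np


def jaccard_feature_graph(X):
    """Pairwise Jaccard similarity between the binary feature columns of X.

    X is an (n_samples, n_features) array; any non-zero entry counts as
    "feature present". Returns a symmetric (n_features, n_features) weight
    matrix with a zero diagonal (no self-loops).
    """
    B = (np.asarray(X) != 0).astype(float)
    inter = B.T @ B                      # co-occurrence counts per feature pair
    counts = np.diag(inter)              # number of samples containing each feature
    union = counts[:, None] + counts[None, :] - inter
    # Leave similarity at 0 where both features are absent everywhere.
    S = np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)
    np.fill_diagonal(S, 0.0)
    return S


def consistency_index(subset_a, subset_b, n_features):
    """Kuncheva's consistency index for two feature subsets of equal size k.

    I_C = (r * d - k^2) / (k * (d - k)), where r is the overlap size and
    d is the total number of features. Values near 1 mean stable selection.
    """
    a, b = set(subset_a), set(subset_b)
    k = len(a)
    if len(b) != k or not 0 < k < n_features:
        raise ValueError("subsets must share a size k with 0 < k < n_features")
    r = len(a & b)
    return (r * n_features - k * k) / (k * (n_features - k))


if __name__ == "__main__":
    # Toy data: 4 patients, 3 binary features (illustrative only).
    X = np.array([[1, 1, 0],
                  [1, 0, 0],
                  [0, 1, 1],
                  [1, 1, 0]])
    print(jaccard_feature_graph(X))
    print(consistency_index([0, 1, 2], [0, 1, 3], n_features=10))
```

In a stabilized sparse model, a similarity matrix like this would typically feed a graph-based regularizer that encourages related features to receive similar weights. How the paper combines the graph with the Cox objective is not specified in the abstract.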