Abstract
The recent wide adoption of electronic medical records (EMRs) presents great opportunities and challenges for data mining. EMR data are largely temporal, often noisy, irregular and high dimensional. This paper constructs a novel ordinal regression framework for predicting medical risk stratification from EMRs. First, a conceptual view of the EMR as a temporal image is constructed to extract a diverse set of features. Second, ordinal modeling is applied to predict cumulative or progressive risk. The challenge is to build a transparent predictive model that works with a large number of weakly predictive features and, at the same time, is stable against resampling variations. Our solution employs sparsity methods that are stabilized through domain-specific feature interaction networks. We introduce two indices that measure model stability against data resampling. Feature networks are used to generate two multivariate Gaussian priors with sparse precision matrices (the Laplacian and Random Walk). We apply the framework to a large short-term suicide risk prediction problem and demonstrate that our methods outperform clinicians by a large margin, discover suicide risk factors that conform with mental health knowledge, and produce models with enhanced stability.
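As a rough illustration of how a feature network can yield a Gaussian prior with a sparse precision matrix, the sketch below builds the unnormalized graph Laplacian of a small, made-up feature network and evaluates the resulting quadratic penalty on a weight vector. The function names, the toy edge list and the choice of the unnormalized Laplacian are assumptions for illustration only; the abstract does not specify the exact construction used in the paper.

```python
import numpy as np


def laplacian_from_edges(n_features, edges):
    """Unnormalized graph Laplacian L = D - A of an undirected feature network.

    `edges` is a list of (i, j) index pairs linking related features
    (hypothetical input format, not taken from the paper).
    """
    adjacency = np.zeros((n_features, n_features))
    for i, j in edges:
        adjacency[i, j] = 1.0
        adjacency[j, i] = 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def network_penalty(weights, laplacian):
    """Quadratic form w^T L w, which equals the sum over edges of (w_i - w_j)^2.

    Up to a scale factor and an additive constant, this is the negative
    log-density of a zero-mean Gaussian prior with precision matrix L.
    """
    return float(weights @ laplacian @ weights)


# Toy network: feature 0 -- feature 1 -- feature 2.
L = laplacian_from_edges(3, [(0, 1), (1, 2)])
w = np.array([1.0, 3.0, 0.0])
print(network_penalty(w, L))  # (1 - 3)^2 + (3 - 0)^2
```

The penalty is small when linked features receive similar weights, which is one way such a prior could discourage a sparse model from switching arbitrarily between correlated features across resamples.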
Notes
This is known as the proportional odds model.
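For reference, the proportional odds model in its standard textbook form can be written as below. The symbols (ordinal outcome \(y\) with \(K\) levels, feature vector \(\mathbf{x}\), weights \(\mathbf{w}\), thresholds \(\theta_k\)) are chosen here for illustration and may not match the notation of the full paper.

```latex
\[
  P(y \le k \mid \mathbf{x})
    = \frac{1}{1 + \exp\!\left(-(\theta_k - \mathbf{w}^{\top}\mathbf{x})\right)},
  \qquad k = 1, \dots, K-1,
  \qquad \theta_1 < \theta_2 < \dots < \theta_{K-1},
\]
\[
  \log \frac{P(y \le k \mid \mathbf{x})}{P(y > k \mid \mathbf{x})}
    = \theta_k - \mathbf{w}^{\top}\mathbf{x}.
\]
```

Because the same \(\mathbf{w}\) appears at every threshold, the log-odds at any two levels differ only by the constant \(\theta_k - \theta_{k'}\), which is where the name "proportional odds" comes from.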
Acknowledgments
We thank Ross Arblaster and Ann Larkins for their help with data collection, Paul Cohen for providing management support for the project, Richard Harvey for risk stratification, Michael Berk and Richard Kennedy for their valuable opinions, and the anonymous reviewers for their helpful comments.
Cite this article
Tran, T., Phung, D., Luo, W. et al. Stabilized sparse ordinal regression for medical risk stratification. Knowl Inf Syst 43, 555–582 (2015). https://doi.org/10.1007/s10115-014-0740-4
DOI: https://doi.org/10.1007/s10115-014-0740-4