Stabilized sparse ordinal regression for medical risk stratification

Tran, Truyen; Phung, Dinh; Luo, Wei; Venkatesh, Svetha

doi:10.1007/s10115-014-0740-4

Regular Paper
Published: 17 March 2014

Volume 43, pages 555–582 (2015)

Knowledge and Information Systems

Truyen Tran^1,2, Dinh Phung^1, Wei Luo^1 & Svetha Venkatesh^1

Abstract

The recent wide adoption of electronic medical records (EMRs) presents great opportunities and challenges for data mining. EMR data are largely temporal, often noisy, irregular and high dimensional. This paper constructs a novel ordinal regression framework for predicting medical risk strata from EMRs. First, a conceptual view of the EMR as a temporal image is constructed to extract a diverse set of features. Second, ordinal modeling is applied to predict cumulative or progressive risk. The challenge is to build a transparent predictive model that works with a large number of weakly predictive features and, at the same time, is stable against resampling variations. Our solution employs sparsity methods that are stabilized through domain-specific feature interaction networks. We introduce two indices that measure model stability against data resampling. Feature networks are used to generate two multivariate Gaussian priors with sparse precision matrices (the Laplacian and Random Walk). We apply the framework to a large short-term suicide risk prediction problem and demonstrate that our methods outperform clinicians by a large margin, discover suicide risk factors that conform to mental health knowledge, and produce models with enhanced stability.
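
To make two ideas in the abstract concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It builds an unnormalized graph Laplacian from a toy feature network and evaluates the quadratic form that a zero-mean Gaussian prior with that Laplacian as precision matrix would place on a weight vector, then scores the agreement of feature subsets selected on different resamples with a mean pairwise Jaccard overlap. The function names, the toy network, and the Jaccard measure are illustrative assumptions; the paper's two stability indices and its Random Walk prior are not defined in this preview and are not reproduced here.

```python
from itertools import combinations


def graph_laplacian(n_features, edges):
    """Unnormalized Laplacian L = D - A of an undirected feature network.

    `edges` is a list of (i, j) index pairs with i != j (hypothetical input format).
    """
    adj = [[0.0] * n_features for _ in range(n_features)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0
    return [
        [(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n_features)]
        for i in range(n_features)
    ]


def laplacian_penalty(w, lap):
    """Quadratic form w^T L w, which equals the sum over edges of (w_i - w_j)^2."""
    n = len(w)
    return sum(w[i] * lap[i][j] * w[j] for i in range(n) for j in range(n))


def mean_pairwise_jaccard(selected_sets):
    """Average Jaccard overlap between feature subsets from different resamples.

    A generic stability measure used for illustration only.
    """
    scores = []
    for a, b in combinations([set(s) for s in selected_sets], 2):
        union = a | b
        scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# Toy network: features 0-1 and 1-2 are linked, feature 3 is isolated.
lap = graph_laplacian(4, [(0, 1), (1, 2)])
penalty = laplacian_penalty([1.0, 1.0, 0.0, 2.0], lap)

# Feature subsets selected on three hypothetical bootstrap resamples.
stability = mean_pairwise_jaccard([{0, 1, 3}, {0, 1}, {0, 3}])
```

The penalty is small when linked features carry similar weights, which is the intuition behind using a network-derived precision matrix to stabilize a sparse model. The isolated feature contributes nothing to the penalty, so in practice a ridge term is often added to make the prior proper.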

Notes

http://apps.who.int/classifications/icd10.
http://www.aihw.gov.au/procedures-data-cubes.
This is known as the proportional odds model.
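
For reference, the proportional odds model mentioned in the last note is usually written in the cumulative-logit form below. This is the standard textbook statement (McCullagh 1980), given here as a sketch; the symbols (K ordered risk levels, thresholds \theta_k, weight vector w, feature vector x) are assumed notation, since the paper's own parameterization is not visible in this preview.

```latex
\log \frac{P(y \le k \mid x)}{P(y > k \mid x)} = \theta_k - w^{\top} x,
\qquad k = 1, \dots, K-1,
\qquad \theta_1 \le \theta_2 \le \dots \le \theta_{K-1}.
% Comparing two feature vectors x and x', the cumulative odds ratio is
\frac{P(y \le k \mid x) \,/\, P(y > k \mid x)}{P(y \le k \mid x') \,/\, P(y > k \mid x')}
  = \exp\!\bigl(-w^{\top}(x - x')\bigr),
% which does not depend on k: the odds are proportional across all thresholds.
```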

Acknowledgments

We thank Ross Arblaster and Ann Larkins for helping with data collection, Paul Cohen for providing management support for the project, Richard Harvey for risk stratification, Michael Berk and Richard Kennedy for valuable opinions, and the anonymous reviewers for helpful comments.

Author information

Authors and Affiliations

Center for Pattern Recognition and Data Analytics, School of IT, Deakin University, 75 Pigdons Rd, Waurn Ponds, VIC, 3216, Australia
Truyen Tran, Dinh Phung, Wei Luo & Svetha Venkatesh
Department of Computing, Curtin University, Bentley, WA, Australia
Truyen Tran

Corresponding author

Correspondence to Truyen Tran.

About this article

Cite this article

Tran, T., Phung, D., Luo, W. et al. Stabilized sparse ordinal regression for medical risk stratification. Knowl Inf Syst 43, 555–582 (2015). https://doi.org/10.1007/s10115-014-0740-4

Received: 23 September 2013
Revised: 14 January 2014
Accepted: 27 February 2014
Published: 17 March 2014
Issue Date: June 2015
DOI: https://doi.org/10.1007/s10115-014-0740-4
