Joint Prediction of Chronic Conditions Onset: Comparing Multivariate Probits with Multiclass Support Vector Machines

  • Shima Ghassem Pour
  • Federico Girosi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9653)


We consider the problem of building accurate models that can predict, in the short term (2–3 years), the onset of one or more chronic conditions at the individual level. Five chronic conditions are considered: heart disease, stroke, diabetes, hypertension and cancer. Covariates for the models include standard demographic and socio-economic variables, risk factors, and the presence of the chronic conditions at baseline. We compare two predictive models. The first is the multivariate probit (MVP), chosen because it allows correlated outcome variables to be modelled jointly. The second is the Multiclass Support Vector Machine (MSVM), a leading predictive method in machine learning. We use Australian data from the Social, Economic, and Environmental Factor (SEEF) study, a follow-up to the 45 and Up Study survey, which contains two repeated observations of 60,000 individuals over age 45 in NSW. We find that MSVM predictions have specificity rates similar to those of the MVP, but sensitivity rates that are on average 12 percentage points higher, which translates into a large average relative improvement in sensitivity of 30 %.
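The comparison above is stated in terms of sensitivity and specificity for each condition, and it distinguishes a gain in percentage points from a percentage improvement. The following is a minimal sketch of how such quantities could be computed, not the authors' code: the function names, the 0/1 matrix layout (one row per individual, one column per condition) and the 0.40 baseline in the usage lines are assumptions made only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# The five conditions named in the abstract; the column order is an assumption.
CONDITIONS = ["heart disease", "stroke", "diabetes", "hypertension", "cancer"]


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one binary outcome coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Guard against outcomes with no positives or no negatives.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def per_condition_rates(
    Y_true: List[Sequence[int]],
    Y_pred: List[Sequence[int]],
    names: Sequence[str],
) -> Dict[str, Tuple[float, float]]:
    """Compute (sensitivity, specificity) separately for each condition column."""
    rates = {}
    for j, name in enumerate(names):
        col_true = [row[j] for row in Y_true]
        col_pred = [row[j] for row in Y_pred]
        rates[name] = sensitivity_specificity(col_true, col_pred)
    return rates


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative change of `improved` over `baseline` (0.30 means 30 %)."""
    return (improved - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical baseline, not a figure from the paper: a 12-point gain
    # is a 30 % relative gain only when the baseline is about 0.40.
    gain = relative_improvement(0.40, 0.40 + 0.12)
    print(f"relative improvement: {gain:.2f}")
```

Under this reading, a gain of 12 percentage points and a 30 % improvement are consistent only if the baseline average sensitivity is near 40 %. The abstract does not report that baseline, so it should be treated as an inference rather than a result.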


Chronic Condition · Deep Learn · Latent Variable Model · Multiclass Support Vector Machine · Multivariate Probit Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This research was completed using data collected through the 45 and Up Study. The 45 and Up Study is managed by the Sax Institute in collaboration with major partner Cancer Council NSW, and partners the National Heart Foundation of Australia (NSW Division), NSW Ministry of Health, beyondblue, NSW Government Family & Community Services Carers, Ageing and Disability Inclusion, and the Australian Red Cross Blood Service. We thank the many thousands of people participating in the 45 and Up Study. We also thank Capital Markets CRC, which sponsored this research.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Centre for Health Research, Western Sydney University, Sydney, Australia
  2. Capital Markets CRC, Sydney, Australia
