Scandent Tree: A Random Forest Learning Method for Incomplete Multimodal Datasets

  • Soheil Hor
  • Mehdi Moradi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9349)


We propose a solution for training random forests on incomplete multimodal datasets in which many of the samples are non-randomly missing a large portion of the most discriminative features. To this end, we present the novel concept of scandent trees: trees trained on the features common to all samples that mimic the feature-space division structure of a support decision tree trained on all features. We use the forest obtained by ensembling these trees as a classification model. We evaluate the performance of our method for different multimodal sample sizes and single-modal feature set sizes using a publicly available clinical dataset of heart disease patients and a prostate cancer dataset with MRI and gene expression modalities. The results show that the area under the ROC curve of the proposed method is less sensitive to the multimodal dataset sample size, and that it outperforms imputation methods, especially when the ratio of multimodal data to all available data is small.
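The abstract does not spell out how a scandent tree is trained, so the following is only a minimal sketch of one possible reading, reduced to a single tree node. It assumes that a "support" stump is fitted on the multimodal samples using all features, and that a second stump, restricted to the features common to all samples, is then fitted to reproduce the support stump's left/right assignment. The helper names (`fit_stump`, `goes_right`), the Gini split criterion, and the toy data are all illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[int, float]  # (feature index, threshold)


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_stump(rows: Sequence[Sequence[float]], labels: Sequence[int],
              features: Sequence[int]) -> Stump:
    """Pick the (feature, threshold) split with the lowest weighted Gini impurity.

    Assumes at least one candidate feature takes two distinct values.
    """
    best = None  # (score, feature, threshold)
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, thr)
    return best[1], best[2]


def goes_right(row: Sequence, stump: Stump) -> int:
    """Return 1 if the row falls into the right branch of the stump, else 0."""
    feature, threshold = stump
    return int(row[feature] > threshold)


# Toy data (hypothetical). Feature 0 is available for every sample;
# feature 1 is the more discriminative one, present only in multimodal samples.
multimodal_rows: List[Tuple[float, float]] = [
    (1.0, 0.125), (2.0, 0.875), (3.0, 0.25),
    (4.0, 0.75), (1.5, 0.375), (3.5, 0.625),
]
multimodal_labels = [0, 1, 0, 1, 0, 1]
common_features = [0]
all_features = [0, 1]

# Support stump: fitted on the multimodal samples with all features.
support = fit_stump(multimodal_rows, multimodal_labels, all_features)

# Branch taken by each multimodal sample at the support node.
branches = [goes_right(r, support) for r in multimodal_rows]

# Mimicking stump: common features only, trained to reproduce those branches.
scandent = fit_stump(multimodal_rows, branches, common_features)

# A single-modal sample lacks feature 1 but can still be routed.
single_modal_sample = (3.5, None)
predicted_branch = goes_right(single_modal_sample, scandent)

# Fraction of multimodal samples on which the two stumps agree.
agreement = sum(goes_right(r, scandent) == b
                for r, b in zip(multimodal_rows, branches)) / len(branches)
```

In this toy setting the support stump splits on the multimodal-only feature, while the mimicking stump approximates the same partition from the common feature alone, so samples without the second modality can still be sent down a branch. A full method would repeat this at every node and ensemble many such trees; how the paper does so is not described in the abstract.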


Keywords: Random Forest, Dynamic Contrast Enhance, Multimodal Data, Single Modality Tree, Prostate Cancer Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Soheil Hor (1)
  • Mehdi Moradi (2)

  1. University of British Columbia, Vancouver, Canada
  2. IBM Almaden Research Center, San Jose, USA
