Entanglement and Differentiable Information Gain Maximization

  • A. Montillo
  • J. Tu
  • J. Shotton
  • J. Winn
  • J. E. Iglesias
  • D. N. Metaxas
  • A. Criminisi
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)


Decision forests can be thought of as a flexible optimization toolbox with many avenues to alter or recombine the underlying architectural components and improve recognition accuracy and efficiency. In this chapter, we present two fundamental approaches for re-architecting decision forests that yield higher prediction accuracy and shortened decision time.

The first approach is entanglement, i.e., using the learned tree structure and the intermediate probabilities computed in nodes closer to the root to affect the training of nodes deeper in the trees. Unlike more conventional classifiers, which assume that all data points (even those neighboring in space or time) are independent and identically distributed (IID), the entanglement approach learns semantic correlation in non-IID data. To demonstrate, we build an entangled decision forest (EDF) that exploits spatial correlation in human anatomy by simultaneously labeling voxels in computed tomography (CT) scans into 12 anatomical structures.
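The entanglement idea can be illustrated with a minimal Python sketch. It is not the chapter's EDF implementation: the 1D "scan", the tissue names, the threshold, and the helper names `leaf_posteriors` and `entangled_feature` are all assumptions made for illustration. It shows how class probabilities produced by a shallow split can serve as a feature for a deeper node, so that two tissues with identical intensity become separable through their neighbors.

```python
from collections import Counter

def leaf_posteriors(values, labels, threshold):
    """Split samples at `threshold` and return, for each sample, the
    normalized class histogram of the child node it falls into."""
    classes = sorted(set(labels))
    sides = [v > threshold for v in values]
    posts = {}
    for side in (False, True):
        members = [l for s, l in zip(sides, labels) if s == side]
        counts = Counter(members)
        total = max(len(members), 1)  # avoid division by zero for empty children
        posts[side] = {c: counts[c] / total for c in classes}
    return [posts[s] for s in sides]

def entangled_feature(posteriors, index, offset, cls):
    """Hypothetical entanglement feature: the intermediate probability of
    class `cls` at a neighboring position, read from the shallower node."""
    j = min(max(index + offset, 0), len(posteriors) - 1)  # clamp at borders
    return posteriors[j][cls]

# Toy 1D scan (assumed data): liver and spleen share intensity 0.5,
# but liver sits next to bone and spleen sits next to air.
labels = ["bone", "liver", "bone", "liver", "air", "spleen", "air", "spleen"]
values = [0.9, 0.5, 0.9, 0.5, 0.1, 0.5, 0.1, 0.5]

# Shallow node: a plain intensity split isolates bone.
posts = leaf_posteriors(values, labels, threshold=0.7)

# Deeper node: query P(bone) one position to the left of each sample.
liver_feat = [entangled_feature(posts, i, -1, "bone") for i in (1, 3)]
spleen_feat = [entangled_feature(posts, i, -1, "bone") for i in (5, 7)]
print(liver_feat, spleen_feat)
```

In this toy setting the intensity alone cannot tell liver from spleen, while the feature built from the shallower node's posterior does, which is the kind of spatial correlation the paragraph above refers to.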

The second approach is the formulation of information gain as a function that is differentiable with respect to the parameters of the split node weak learner. This increases the confidence and accuracy of maximum margin boundary localization and reduces classification time by using a few shallow trees. We further extend the method to incorporate training label confidence, when available, into the information gain maximization. Because bagging and random feature subset selection are still used, the method retains decision forest virtues such as resilience to overfitting. To demonstrate, we build a gradient ascent decision forest (GADF) that tracks visual objects in videos. For both approaches, superior accuracy and computational efficiency are shown in quantitative comparisons with state-of-the-art algorithms.
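One way to picture a differentiable information gain is sketched below in Python. This is an illustration under stated assumptions, not the chapter's GADF formulation: the sigmoid soft split, the 1D linear weak learner with parameters `(w, b)`, the finite-difference gradient, and the step size are all choices made here for brevity. Replacing the hard left/right assignment with a soft one makes the class counts, and therefore the gain, smooth in the split parameters, so plain gradient ascent can be applied.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def entropy(counts):
    # Shannon entropy (in nats) of a list of non-negative soft counts.
    total = sum(counts)
    if total <= 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def soft_gain(params, xs, ys):
    """Information gain of a soft split: each sample goes right with
    probability sigmoid(w * x + b) and left with the remainder."""
    w, b = params
    classes = sorted(set(ys))
    right = {c: 0.0 for c in classes}
    left = {c: 0.0 for c in classes}
    for x, y in zip(xs, ys):
        r = sigmoid(w * x + b)
        right[y] += r
        left[y] += 1.0 - r
    n = float(len(xs))
    parent = [left[c] + right[c] for c in classes]
    n_l, n_r = sum(left.values()), sum(right.values())
    return (entropy(parent)
            - (n_l / n) * entropy(list(left.values()))
            - (n_r / n) * entropy(list(right.values())))

def gradient_ascent(params, xs, ys, lr=0.2, steps=300, eps=1e-5):
    """Maximize soft_gain with a central finite-difference gradient
    (an assumption made to keep the sketch short)."""
    params = list(params)
    for _ in range(steps):
        grad = []
        for k in range(len(params)):
            up, dn = params[:], params[:]
            up[k] += eps
            dn[k] -= eps
            grad.append((soft_gain(up, xs, ys) - soft_gain(dn, xs, ys)) / (2 * eps))
        params = [p + lr * g for p, g in zip(params, grad)]
    return params

# Toy data (assumed): two classes separable between x = 2 and x = 3.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]
start = [0.5, -1.0]
fitted = gradient_ascent(start, xs, ys)
print(soft_gain(start, xs, ys), soft_gain(fitted, xs, ys))
```

The finite-difference gradient stands in for an analytic one only to keep the example compact. The gain here is bounded above by the parent entropy, which is log 2 for this balanced two-class toy set.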


Keywords (machine-generated, not supplied by the authors): Information Gain; Split Function; Proposal Distribution; Maximum Margin; Weak Learner



Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  • A. Montillo (1)
  • J. Tu (1)
  • J. Shotton (2)
  • J. Winn (2)
  • J. E. Iglesias (3)
  • D. N. Metaxas (4)
  • A. Criminisi (2)

  1. General Electric Global Research, Niskayuna, USA
  2. Microsoft Research Ltd, Cambridge, UK
  3. Massachusetts General Hospital, Harvard Medical School, Boston, USA
  4. Rutgers, Piscataway, USA
