Convolutional Decision Trees for Feature Learning and Segmentation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8753)


Most computer vision tasks, and segmentation tasks in particular, require extracting features that represent the local appearance of patches. Learning algorithms can then process the relevant features to infer the posterior probability that each pixel belongs to an object of interest. Deep Convolutional Neural Networks (CNNs) are a particularly successful class of learning algorithms for semantic segmentation, although they have proved very slow to train, even with special-purpose hardware. We propose, for the first time, a general-purpose segmentation algorithm that extracts the most informative and interpretable features as convolution kernels while simultaneously building a multivariate decision tree. The algorithm trains several orders of magnitude faster than regular CNNs and achieves state-of-the-art processing quality on benchmark datasets.
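To make the idea of a multivariate split driven by a convolution kernel concrete, the following is a minimal sketch and not the authors' implementation. It assumes binary pixel labels, a single tree node, and a fixed kernel given by hand; the function names (`entropy`, `kernel_response`, `split_information_gain`) and the toy patches are hypothetical. It only scores a candidate kernel by the information gain of the split it induces and does not show how the kernel itself would be learned.

```python
import math
from typing import List, Sequence

Patch = List[List[float]]


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy in bits of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)


def kernel_response(patch: Patch, kernel: Patch, bias: float = 0.0) -> float:
    """Dot product of a patch with a kernel of the same size, plus a bias."""
    return sum(
        p * k
        for row_p, row_k in zip(patch, kernel)
        for p, k in zip(row_p, row_k)
    ) + bias


def split_information_gain(
    patches: Sequence[Patch],
    labels: Sequence[int],
    kernel: Patch,
    bias: float = 0.0,
) -> float:
    """Information gain of sending patches left/right by the sign of the response."""
    left = [y for x, y in zip(patches, labels) if kernel_response(x, kernel, bias) <= 0]
    right = [y for x, y in zip(patches, labels) if kernel_response(x, kernel, bias) > 0]
    n = len(labels)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - children


# Toy data (assumed for illustration): two vertical-edge patches labelled 1,
# two flat patches labelled 0.
patches = [
    [[0.0, 1.0], [0.0, 1.0]],
    [[0.0, 1.0], [0.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
labels = [1, 1, 0, 0]

edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]  # responds to a dark-to-bright vertical edge
zero_kernel = [[0.0, 0.0], [0.0, 0.0]]    # uninformative: every patch goes left

gain_edge = split_information_gain(patches, labels, edge_kernel, bias=-1.0)
gain_zero = split_information_gain(patches, labels, zero_kernel)
```

On this toy data the edge kernel separates the two classes perfectly, so its gain equals the parent entropy of one bit, while the zero kernel yields no gain. A tree-building procedure would pick, at each node, a kernel with high gain and recurse on the two resulting subsets.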


Keywords: Information Gain, Sparse Code, Convolutional Neural Network, Convolution Kernel, Segmentation Task



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. ETH Zurich, Zurich, Switzerland
