Machine Vision and Applications, Volume 24, Issue 5, pp 1043–1053

Large-scale Gaussian process multi-class classification for semantic segmentation and facade recognition

  • Björn Fröhlich
  • Erik Rodner
  • Michael Kemmler
  • Joachim Denzler
Original Paper


This paper deals with the task of semantic segmentation, which aims to provide a complete description of an image by inferring a pixelwise labeling. While pixelwise classification is a suitable approach to achieve this goal, state-of-the-art kernel methods are generally not applicable because the training and testing phases involve large amounts of data. We address this problem by presenting a method for large-scale inference with Gaussian processes. The standard limitations of Gaussian process classifiers in terms of speed and memory are overcome by pre-clustering the data using decision trees. This breaks the entire problem down into several independent classification tasks whose complexity is controlled by the maximum number of training examples allowed in the tree leaves. We additionally propose a technique for computing multi-class probabilities that incorporates the uncertainties of the classifier estimates. The approach provides pixelwise semantics for a wide range of applications and image types, such as those from scene understanding, defect localization, and remote sensing. Experiments on a facade recognition application show that our method achieves a significant performance gain over previous approaches.
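The idea of pre-clustering with a tree and training one Gaussian process per leaf can be illustrated with a minimal sketch. It is not the authors' implementation: the median split on a randomly chosen feature, the RBF kernel, the one-vs-all label regression with targets ±1, the noise level, and the Monte Carlo estimate of multi-class probabilities are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def build_tree(X, y, max_leaf, rng):
    # Split recursively until a node holds at most max_leaf examples.
    # Assumption: median split on a randomly chosen feature dimension.
    if len(X) <= max_leaf:
        return {"X": X, "y": y}
    for dim in rng.permutation(X.shape[1]):
        thr = np.median(X[:, dim])
        left = X[:, dim] <= thr
        if 0 < left.sum() < len(X):
            return {"dim": dim, "thr": thr,
                    "lo": build_tree(X[left], y[left], max_leaf, rng),
                    "hi": build_tree(X[~left], y[~left], max_leaf, rng)}
    return {"X": X, "y": y}  # no valid split found: keep the node as a leaf

def fit_tree(node, n_classes, noise=0.1):
    # Train an independent GP (one-vs-all label regression) in every leaf.
    if "dim" in node:
        fit_tree(node["lo"], n_classes, noise)
        fit_tree(node["hi"], n_classes, noise)
        return
    K = rbf(node["X"], node["X"]) + noise * np.eye(len(node["X"]))
    Y = np.where(node["y"][:, None] == np.arange(n_classes), 1.0, -1.0)
    node["Kinv"] = np.linalg.inv(K)   # cost is bounded by the leaf size
    node["alpha"] = node["Kinv"] @ Y  # one weight column per class

def find_leaf(node, x):
    # Route a test example down the tree to its leaf.
    while "dim" in node:
        node = node["lo"] if x[node["dim"]] <= node["thr"] else node["hi"]
    return node

def predict_proba(tree, x, rng, n_samples=2000):
    leaf = find_leaf(tree, x)
    k = rbf(x[None, :], leaf["X"])[0]
    mean = k @ leaf["alpha"]                     # predictive mean per class
    var = max(1.0 - k @ leaf["Kinv"] @ k, 1e-9)  # predictive variance
    # Assumption: estimate P(class c has the largest score) by sampling
    # from the per-class Gaussian predictive distributions.
    draws = mean + np.sqrt(var) * rng.standard_normal((n_samples, len(mean)))
    return np.bincount(draws.argmax(axis=1), minlength=len(mean)) / n_samples

# Toy data: three well-separated 2-D blobs, 100 examples each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 6.0], [-6.0, 6.0]])
y = np.repeat(np.arange(3), 100)
X = centers[y] + 0.5 * rng.standard_normal((300, 2))

tree = build_tree(X, y, max_leaf=50, rng=rng)
fit_tree(tree, n_classes=3)
print(predict_proba(tree, X[0], rng))
```

In this sketch the `max_leaf` parameter plays the role of the maximum number of training examples allowed in a leaf: each leaf inverts a kernel matrix of at most `max_leaf` rows, so the cubic cost of Gaussian process training applies per leaf rather than to the whole training set.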


Keywords: Large scale classification, Gaussian processes, Random decision forest, Semantic segmentation, Facade recognition, Scene interpretation



This work was partially supported by the Graduate School on Image Processing and Image Interpretation funded by the state of Thuringia/Germany.



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Björn Fröhlich (1), corresponding author
  • Erik Rodner (1, 2)
  • Michael Kemmler (1)
  • Joachim Denzler (1)
  1. Friedrich Schiller University, Jena, Germany
  2. ICSI, UC Berkeley, USA
