The Importance of Structure

Part of the Springer Tracts in Advanced Robotics book series (STAR, volume 100)


Many tasks in robotics and computer vision are concerned with inferring a continuous or discrete state variable from observations and measurements of the environment. Due to the high-dimensional nature of the input data, the inference is often cast as a two-stage process: first a low-dimensional feature representation is extracted, and then a learning algorithm is applied to it. Owing to the significant progress achieved in machine learning over the last decade, focus has been placed on the second stage of the inference process, improving it by applying more advanced learning techniques to the same (or more of the same) data. We believe that for many scenarios significant strides in performance could be achieved by focusing on representation, rather than trying to compensate for inconclusive and/or redundant information with more advanced inference methods. This stems from the notion that, given the “correct” representation, the inference problem becomes easier to solve. In this paper we argue that one important mode of information for many application scenarios is not the actual variation in the data but rather the higher-order statistics, such as the structure of the variations. We exemplify this through a set of applications and show different ways of representing the structure of data.


Keywords: Feature Space, Global Structure, Local Patch, Latent Variable Model, Principled Manner



Copyright information

© Springer International Publishing Switzerland 2017

Authors and Affiliations

  1. University of Bristol, Bristol, UK
  2. Royal Institute of Technology, Stockholm, Sweden
