Multi-modal Learning

Introduction

This chapter is about learning, and more specifically multi-modal learning.

In biological systems, learning occurs in various forms and at various developmental stages, facilitating adaptation to an ever-changing environment. Learning is also one of the most fundamental capabilities of an artificial cognitive system, and significant effort in CoSy has therefore been dedicated to researching a variety of issues related to it.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Skočaj, D. et al. (2010). Multi-modal Learning. In: Christensen, H.I., Kruijff, GJ.M., Wyatt, J.L. (eds) Cognitive Systems. Cognitive Systems Monographs, vol 8. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11694-0_7

  • DOI: https://doi.org/10.1007/978-3-642-11694-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11693-3

  • Online ISBN: 978-3-642-11694-0

  • eBook Packages: Engineering, Engineering (R0)