Grounding the Meaning of Words with Visual Attributes

Chapter in: Visual Attributes

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

We address the problem of grounding representations of word meaning. Our approach learns higher-level representations in a stacked autoencoder architecture from visual and textual input. The two input modalities are encoded as vectors of attributes and are obtained automatically from images and text. To obtain visual attributes (e.g. has_legs, is_yellow) from images, we train attribute classifiers using our large-scale taxonomy of 600 visual attributes, which covers more than 500 concepts and 700K images. We extract textual attributes (e.g. bird, breed) from text with an existing distributional model. Experimental results on tasks related to word similarity show that our stacked autoencoder model can usefully integrate the attribute-based vectors to create bimodal representations that are overall more accurate than representations based on the individual modalities or on different integration mechanisms. The work presented in this chapter is based on [89].
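
The idea of fusing two attribute vectors in a stacked autoencoder can be illustrated with a minimal sketch. It is not the chapter's model: the layer sizes, the random untrained weights, the single-layer decoder, and all names (`encode`, `reconstruct`, `N_VIS`, and so on) are assumptions made for illustration. It shows only a forward pass in which each modality is first encoded separately, the two codes are concatenated and encoded jointly, and the bimodal code is decoded back to both inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layer(n_in, n_out):
    """Small random weights and a zero bias for one dense layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

# Hypothetical sizes: 6 visual attributes, 5 textual attributes.
N_VIS, N_TXT, N_HID, N_JOINT = 6, 5, 4, 3

# First layer: one encoder per modality.
W_vis, b_vis = init_layer(N_VIS, N_HID)
W_txt, b_txt = init_layer(N_TXT, N_HID)
# Second layer: joint encoder over the concatenated unimodal codes.
W_joint, b_joint = init_layer(2 * N_HID, N_JOINT)
# Simplified decoder: maps the bimodal code straight back to both inputs.
W_dec, b_dec = init_layer(N_JOINT, N_VIS + N_TXT)

def encode(visual, textual):
    """Return the bimodal code for one concept."""
    h_vis = sigmoid(visual @ W_vis + b_vis)
    h_txt = sigmoid(textual @ W_txt + b_txt)
    return sigmoid(np.concatenate([h_vis, h_txt]) @ W_joint + b_joint)

def reconstruct(code):
    """Decode a bimodal code into a concatenated [visual, textual] vector."""
    return sigmoid(code @ W_dec + b_dec)

# Made-up attribute vectors for a single concept.
visual = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 1.0])
textual = np.array([0.8, 0.1, 0.0, 0.3, 0.5])

code = encode(visual, textual)
recon = reconstruct(code)
# Mean squared reconstruction error, the quantity training would minimise.
loss = float(np.mean((recon - np.concatenate([visual, textual])) ** 2))
```

In a trained model of this kind, `code` would serve as the bimodal representation of the concept, and two concepts could be compared by, for example, the cosine of their codes.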

Notes

  1. We use the term word to denote any sequence of non-delimiting symbols.

  2. We use the term concept to denote the mental representation of objects belonging to basic-level classes (e.g. dog), and the term category to refer to superordinate-level classes (e.g. animal).

  3. By the term attributes we refer to semantic properties or characteristics of concepts (or categories), expressed by words which people would use to describe their meaning.

  4. Available at http://homepages.inf.ed.ac.uk/csilbere/resources.html.

  5. In the context of semantic representations, attributes are often called features or properties in the literature. For the sake of consistency of the present work, we will adhere to the former term.

  6. They are often termed semantic feature production norms (e.g. [66]) or property norms (e.g. [24]) in the literature.

  7. Available at http://www.image-net.org.

  8. Available at http://homepages.inf.ed.ac.uk/s1151656/resources.html.

  9. The code by [28] is available at http://vision.cs.uiuc.edu/attributes/ (last accessed in May 2015).

  10. Threshold values ranged from 0 to 0.9 with a step size of 0.1.

  11. For simplicity, we use the symbol w to denote both the concept and its index. Analogously, the symbol a denotes both the attribute and its index.

  12. The software is available at http://clic.cimec.unitn.it/strudel/.

  13. In a one-hot vector (a.k.a. 1-of-N coding), exactly one element is one and the others are zero. In our case, the non-zero element corresponds to the object label.

  14. See [89] for more experiments.

  15. Available at http://homepages.inf.ed.ac.uk/s1151656/resources.html.

  16. The corpus is downloadable from http://wacky.sslmit.unibo.it/doku.php?id=corpora.

  17. We performed random search over combinations of hyper-parameter values.

  18. Available at http://w3.usf.edu/FreeAssociation.

  19. We thank Elia Bruni for providing us with their data.

  20. From http://wacky.sslmit.unibo.it/doku.php?id=corpora.

  21. The vectors are available at https://code.google.com/p/word2vec/.

  22. Available at http://homepages.inf.ed.ac.uk/s0897549/data/.
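
The one-hot coding described in note 13 can be sketched as follows. The label set and the helper name `one_hot` are hypothetical and serve only to illustrate the encoding.

```python
import numpy as np

def one_hot(index, size):
    """Return a vector of length `size` with a single 1 at position `index`."""
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec

# Hypothetical object labels; the non-zero element marks the label "cat".
labels = ["dog", "cat", "bird"]
vec = one_hot(labels.index("cat"), len(labels))
```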

References

  1. Agirre, E., Soroa, A.: SemEval-2007 Task 02: Evaluating word sense induction and discrimination systems. In: Proceedings of the Fourth International Workshop on Semantic Evaluations (2007)

  2. Andrews, M., Vigliocco, G., Vinson, D.: Integrating experiential and distributional data to learn semantic representations. Psychol. Rev. 116(3), 463–498 (2009)

  3. Barbu, E.: Combining methods to learn feature-norm-like concept descriptions. In: Proceedings of the ESSLLI Workshop on Distributional Lexical Semantics (2008)

  4. Baroni, M., Murphy, B., Barbu, E., Poesio, M.: Strudel: a corpus-based semantic model based on properties and types. Cogn. Sci. 34(2), 222–254 (2010)

  5. Barsalou, L.: Perceptual symbol systems. Behav. Brain Sci. 22, 577–609 (1999)

  6. Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2(1), 1–127 (2009)

  7. Bengio, Y., Ducharme, R., Vincent, P., Janvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. (JMLR) 3, 1137–1155 (2003)

  8. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: Conference on Neural Information Processing Systems (NIPS) (2006)

  9. Biemann, C.: Chinese whispers—an efficient graph clustering algorithm and its application to natural language processing problems. In: Proceedings of TextGraphs: The 1st Workshop on Graph Based Methods for Natural Language Processing (2006)

  10. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. (JMLR) 3, 993–1022 (2003)

  11. Bornstein, M.H., Cote, L.R., Maital, S., Painter, K., Park, S.-Y., Pascual, L.: Cross-linguistic analysis of vocabulary in young children: Spanish, Dutch, French, Hebrew, Italian, Korean, and American English. Child Dev. 75(4), 1115–1139 (2004)

  12. Bruni, E., Tran, G., Baroni, M.: Distributional semantics from text and images. In: Proceedings of the GEMS 2011 workshop on geometrical models of natural language semantics (2011)

  13. Bruni, E., Boleda, G., Baroni, M., Tran, N.: Distributional semantics in technicolor. In: Proceedings of the 50th annual meeting of the association for computational linguistics (2012)

  14. Bruni, E., Bordignon, U., Liska, A., Uijlings, J., Sergienya, I.: VSEM: an open library for visual semantics representation. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations (2013)

  15. Bruni, E., Tran, N., Baroni, M.: Multimodal distributional semantics. J. Artif. Intell. Res. (JAIR) 49, 1–47 (2014)

  16. Chen, H., Gallagher, A., Girod, B.: Describing clothing by semantic attributes. In: European Conference on Computer Vision (ECCV) (2012)

  17. Cimpoi, M., Maji, S., Kokkinos, I., Mohamed, S., Vedaldi, A.: Describing textures in the wild. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2014)

  18. Collins, A.M., Loftus, E.F.: A spreading-activation theory of semantic processing. Psychol. Rev. 82(6), 407 (1975)

  19. Collobert, R., Weston, J.: A unified architecture for natural language processing: deep neural networks with multitask learning. In: International Conference on Machine Learning (ICML) (2008)

  20. Cree, G.S., McRae, K., McNorgan, C.: An attractor model of lexical conceptual processing: simulating semantic priming. Cogn. Sci. 23(3), 371–414 (1999)

  21. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. J. Am. Soc. Inform. Sci. 41(6), 391–407 (1990)

  22. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

  23. Devereux, B., Pilkington, N., Poibeau, T., Korhonen, A.: Towards unrestricted, large-scale acquisition of feature-based conceptual representations from corpus data. Res. Lang. Comput. 7(2–4), 137–170 (2009)

  24. Devereux, B.J., Tyler, L.K., Geertzen, J., Randall, B.: The centre for speech, language and the brain (CSLB) concept property norms. Behav. Res. Methods (2013)

  25. Duan, K., Parikh, D., Crandall, D., Grauman, K.: Discovering localized attributes for fine-grained recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)

  26. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes challenge 2008 results (2008)

  27. Fan, R., Chang, K., Hsieh, C., Wang, X., Lin, C.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. (JMLR) 9, 1871–1874 (2008)

  28. Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.: Describing objects by their attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

  29. Fellbaum, C. (ed.): WordNet: an electronic lexical database. The MIT Press (1998)

  30. Feng, F., Li, R., Wang, X.: Constructing hierarchical image-tags bimodal representations for word tags alternative choice. In: Proceedings of the ICML Workshop on Challenges in Representation Learning (2013)

  31. Feng, Y., Lapata, M.: Visual information in semantic representation. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (2010)

  32. Ferrari, V., Zisserman, A.: Learning visual attributes. In: Conference on Neural Information Processing Systems (NIPS) (2007)

  33. Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G., Ruppin, E.: Placing search in context: the concept revisited. ACM Trans. Inform. Syst. 20(1), 116–131 (2002)

  34. Fountain, T., Lapata, M.: Meaning representation in natural language categorization. In: Proceedings of the 31st Annual Conference of the Cognitive Science Society (2010)

  35. Frermann, L., Lapata, M.: Incremental Bayesian learning of semantic categories. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (2014)

  36. Glenberg, A.M., Kaschak, M.P.: Grounding language in action. Psychon. Bull. Rev. 9(3), 558–565 (2002)

  37. Goldstone, R.L., Kersten, A., Carvalho, P.F.: Concepts and categorization. In: Healy, A.F., Proctor, R.W. (eds.) Comprehensive Handbook of Psychology, vol. 4: Experimental Psychology, pp. 607–630. Wiley (2012)

  38. Griffiths, T.L., Steyvers, M., Tenenbaum, J.B.: Topics in semantic representation. Psychol. Rev. 114(2), 211–244 (2007)

  39. Grondin, R., Lupker, S., McRae, K.: Shared features dominate semantic richness effects for concrete concepts. J. Mem. Lang. 60(1), 1–19 (2009)

  40. Harris, Z.S.: Distributional structure. Word 10(2–3), 146–162 (1954)

  41. Hill, F., Korhonen, A.: Learning abstract concept embeddings from multi-modal data: since you probably can't see what I mean. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (2014)

  42. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

  43. Hsu, A.S., Martin, J.B., Sanborn, A.N., Griffiths, T.L.: Identifying representations of categories of discrete items using Markov Chain Monte Carlo with people. In: Proceedings of the 34th annual conference of the cognitive science society (2012)

  44. Huang, E.H., Socher, R., Manning, C.D., Ng, A.Y.: Improving word representations via global context and multiple word prototypes. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers (2012)

  45. Huang, J., Kingsbury, B.: Audio-visual deep learning for noise robust speech recognition. In: Proceedings 38th International Conference on Acoustics, Speech, and Signal Processing (2013)

  46. Huiskes, M.J., Lew, M.S.: The MIR Flickr retrieval evaluation. In: Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval (2008)

  47. Johns, B.T., Jones, M.N.: Perceptual inference through global lexical similarity. Topics Cogn. Sci. 4(1), 103–120 (2012)

  48. Jones, M.N., Willits, J.A., Dennis, S.: Models of semantic memory. In: Busemeyer, J., Townsend, J., Wang, Z., Eidels, A. (eds.) The Oxford Handbook of Computational and Mathematical Psychology, pp. 232–254. Oxford University Press (2015)

  49. Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

  50. Kelly, C., Devereux, B., Korhonen, A.: Acquiring human-like feature-based conceptual representations from corpora. In: NAACL HLT Workshop on Computational Neurolinguistics (2010)

  51. Kiela, D., Bottou, L.: Learning image embeddings using convolutional neural networks for improved multi-modal semantics. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (2014)

  52. Kim, Y., Lee, H., Provost, E.M.: Deep learning for robust feature generation in audiovisual emotion recognition. In: IEEE International Conference on Acoustics, Speech and Signal Processing (2013)

  53. Kiros, R., Salakhutdinov, R., Zemel, R.: Unifying visual-semantic embeddings with multimodal neural language models. In: NIPS Deep Learning and Representation Learning Workshop (2014)

  54. Kumar, N., Belhumeur, P.N., Nayar, S.K.: FaceTracer: a search engine for large collections of images with faces. In: European Conference on Computer Vision (ECCV) (2008)

  55. Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Describable visual attributes for face verification and image search. IEEE Trans. Pattern Anal. Mach. Intell. (PAMI) 33(10), 1962–1977 (2011)

  56. Laffont, P.-Y., Ren, Z., Tao, X., Qian, C., Hays, J.: Transient attributes for high-level understanding and editing of outdoor scenes. ACM Trans. Graph. 33(4), 149:1–149:11 (2014)

  57. Lampert, C.H., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

  58. Landau, B., Smith, L., Jones, S.: Object perception and object naming in early development. Trends Cogn. Sci. 2(1), 19–24 (1998)

  59. Landauer, T., Dumais, S.T.: A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol. Rev. 104(2), 211–240 (1997)

  60. Lazaridou, A., Pham, N.T., Baroni, M.: Combining language and vision with a multimodal skip-gram model. In: Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the Association for Computational Linguistics (2015)

  61. Liu, J., Kuipers, B., Savarese, S.: Recognizing human actions by attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)

  62. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision (IJCV) 60(2), 91–110 (2004)

  63. Lund, K., Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence. Behav. Res. Methods Instrum. Comput. 28(2), 203–208 (1996)

  64. Mao, J., Xu, W., Yang, Y., Wang, J., Yuille, A.L.: Explain images with multimodal recurrent neural networks. In: NIPS Deep Learning and Representation Learning Workshop (2014)

  65. McRae, K., Jones, M.: Semantic memory. In: Reisberg, D. (ed.) The Oxford Handbook of Cognitive Psychology. Oxford University Press (2013)

  66. McRae, K., Cree, G.S., Seidenberg, M.S., McNorgan, C.: Semantic feature production norms for a large set of living and nonliving things. Behav. Res. Methods 37(4), 547–559 (2005)

  67. Medin, D.L., Schaffer, M.M.: Context theory of classification learning. Psychol. Rev. 85(3), 207–238 (1978)

  68. Mervis, C.B., Rosch, E.: Categorization of natural objects. Annu. Rev. Psychol. 32(1), 89–115 (1981)

  69. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Conference on Neural Information Processing Systems (NIPS) (2013)

  70. Mnih, A., Hinton, G.E.: A scalable hierarchical distributed language model. In: Conference on Neural Information Processing Systems (NIPS) (2009)

  71. Nelson, D.L., McEvoy, C.L., Schreiber, T.A.: The University of South Florida Word Association, Rhyme, and Word Fragment Norms (1998)

  72. Ngiam, J., Khosla, A., Kim, M., Nam, J., Lee, H., Ng, A.: Multimodal deep learning. In: International Conference on Machine Learning (ICML) (2011)

  73. O’Connor, C.M., Cree, G.S., McRae, K.: Conceptual hierarchies in a flat attractor network: dynamics of learning and computations. Cogn. Sci. 33(4), 665–708 (2009)

  74. Osherson, D.N., Stern, J., Wilkie, O., Stob, M., Smith, E.E.: Default probability. Cogn. Sci. 15(2), 251–269 (1991)

  75. Parikh, D., Grauman, K.: Relative attributes. In: International Conference on Computer Vision (ICCV) (2011)

  76. Patterson, G., Xu, C., Su, H., Hays, J.: The SUN attribute database: beyond categories for deeper scene understanding. Int. J. Comput. Vision (IJCV) 108(1–2), 59–81 (2014)

  77. Patwardhan, S., Pedersen, T.: Using WordNet-based context vectors to estimate the semantic relatedness of concepts. In: Proceedings of the EACL 2006 Workshop on Making Sense of Sense: Bringing Computational Linguistics and Psycholinguistics Together (2006)

  78. Perfetti, C.: The limits of co-occurrence: tools and theories in language research. Discourse Processes 25(2&3), 363–377 (1998)

  79. Ranzato, M., Szummer, M.: Semi-supervised learning of compact document representations with deep networks. In: International Conference on Machine Learning (ICML) (2008)

  80. Ranzato, M., Poultney, C., Chopra, S., LeCun, Y.: Efficient learning of sparse representations with an energy-based model. In: Conference on Neural Information Processing Systems (NIPS) (2006)

  81. Rastegari, M., Diba, A., Parikh, D., Farhadi, A.: Multi-attribute queries: to merge or not to merge? In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)

  82. Rogers, T.T., McClelland, J.L.: Semantic Cognition: A Parallel Distributed Processing Approach. The MIT Press (2004)

  83. Rogers, T.T., Lambon Ralph, M.A., Garrard, P., Bozeat, S., McClelland, J.L., Hodges, J.R., Patterson, K.: Structure and deterioration of semantic memory: a neuropsychological and computational investigation. Psychol. Rev. 111(1), 205–235 (2004)

  84. Roller, S., Schulte im Walde, S.: A Multimodal LDA model integrating textual, cognitive and visual modalities. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (2013)

  85. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning internal representations by error propagation. In: Rumelhart, D.E., McClelland, J.L. (eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1: Foundations, pp. 318–362. The MIT Press (1986)

  86. Russakovsky, O., Fei-Fei, L.: Attribute learning in large-scale datasets. In: ECCV International Workshop on Parts and Attributes (2010)

  87. Russell, B., Torralba, A., Murphy, K., Freeman, W.: LabelMe: a database and web-based tool for image annotation. Int. J. Comput. Vis. (IJCV) 77, 157–173 (2008)

  88. Salton, G., McGill, M.J.: Introduction to modern information retrieval. McGraw-Hill, Inc. (1986)

  89. Silberer, C.: Learning Visually Grounded Meaning Representations. Ph.D. thesis, Institute for Language, Cognition and Computation, School of Informatics, The University of Edinburgh (2015)

  90. Silberer, C., Lapata, M.: Grounded models of semantic representation. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (2012)

  91. Sloman, S.A., Love, B.C., Ahn, W.-K.: Feature centrality and conceptual coherence. Cogn. Sci. 22(2), 189–228 (1998)

  92. Smith, E.E., Shoben, E.J., Rips, L.J.: Structure and process in semantic memory: a featural model for semantic decisions. Psychol. Rev. 81(3), 214–241 (1974)

  93. Socher, R., Pennington, J., Huang, E.H., Ng, A.Y., Manning, C.D.: Semi-supervised recursive autoencoders for predicting sentiment distributions. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (2011)

  94. Socher, R., Karpathy, A., Le, Q.V., Manning, C., Ng, A.: Grounded compositional semantics for finding and describing images with sentences. Trans. Assoc. Comput. Linguist. 2, 207–218 (2014)

  95. Sohn, K., Shang, W., Lee, H.: Improved multimodal deep learning with variation of information. In: Conference on Neural Information Processing Systems (NIPS) (2014)

  96. Srivastava, N., Salakhutdinov, R.: Multimodal learning with deep Boltzmann machines. In: Conference on Neural Information Processing Systems (NIPS) (2012)

  97. Srivastava, N., Salakhutdinov, R.: Multimodal learning with deep Boltzmann machines. J. Mach. Learn. Res. (JMLR) 15, 2949–2980 (2014)

  98. Szumlanski, S., Gomez, F., Sims, V.K.: A new set of norms for semantic relatedness measures. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (2013)

  99. Taylor, K.I., Devereux, B.J., Acres, K., Randall, B., Tyler, L.K.: Contrasting effects of feature-based statistics on the categorisation and basic-level identification of visual objects. Cognition 122(3), 363–374 (2012)

  100. Turney, P.D., Pantel, P.: From frequency to meaning: vector space models of semantics. J. Artif. Intell. Res. 37(1), 141–188 (2010)

  101. Tyler, L.K., Moss, H.E.: Towards a distributed account of conceptual knowledge. Trends Cogn. Sci. 5(6), 244–252 (2001)

  102. Vanpaemel, W., Storms, G., Ons, B.: A varying abstraction model for categorization. In: Proceedings of the 27th Annual Conference of the Cognitive Science Society (2005)

  103. Varma, M., Zisserman, A.: A statistical approach to texture classification from single images. Int. J. Comput. Vis. (IJCV) (Special Issue on Texture Analysis and Synthesis) 62(1–2), 61–81 (2005)

  104. Vigliocco, G., Vinson, D.P., Lewis, W., Garrett, M.F.: Representing the meanings of object and action words: the featural and unitary semantic space hypothesis. Cogn. Psychol. 48(4), 422–488 (2004)

  105. Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.-A.: Extracting and composing robust features with denoising autoencoders. In: International Conference on Machine Learning (ICML) (2008)

  106. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. (JMLR) 11, 3371–3408 (2010)

  107. Vinson, D.P., Vigliocco, G.: Semantic feature production norms for a large set of objects and events. Behav. Res. Methods 40(1), 183–190 (2008)

  108. von Ahn, L., Dabbish, L.: Labeling images with a computer game. In: Conference on Human Factors in Computing Systems (2004)

  109. Voorspoels, W., Vanpaemel, W., Storms, G.: Exemplars and prototypes in natural language concepts: a typicality-based evaluation. Psychon. Bull. Rev. 15, 630–637 (2008)

  110. Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 Dataset. Technical Report CNS-TR-2011-001, California Institute of Technology (2011)

  111. Westermann, G., Mareschal, D.: From perceptual to language-mediated categorization. Philos. Trans. R Soc. B: Biol. Sci. 369(1634), 20120391 (2014)

Author information

Corresponding author

Correspondence to Carina Silberer.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Silberer, C. (2017). Grounding the Meaning of Words with Visual Attributes. In: Feris, R., Lampert, C., Parikh, D. (eds) Visual Attributes. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-50077-5_13

  • DOI: https://doi.org/10.1007/978-3-319-50077-5_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50075-1

  • Online ISBN: 978-3-319-50077-5

  • eBook Packages: Computer Science, Computer Science (R0)
