Grounding the Meaning of Words with Visual Attributes

Silberer, Carina

doi:10.1007/978-3-319-50077-5_13

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

We address the problem of grounding representations of word meaning. Our approach learns higher-level representations from visual and textual input in a stacked autoencoder architecture. The two input modalities are encoded as vectors of attributes, which are obtained automatically from images and text. To obtain visual attributes (e.g. has_legs, is_yellow) from images, we train attribute classifiers using our large-scale taxonomy of 600 visual attributes, which covers more than 500 concepts and 700K images. We extract textual attributes (e.g. bird, breed) from text with an existing distributional model. Experimental results on tasks related to word similarity show that our stacked autoencoder model can usefully integrate the attribute-based vectors into bimodal representations, which are overall more accurate than representations based on the individual modalities or on different integration mechanisms. The work presented in this chapter is based on [89].
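
As a rough illustration of the kind of architecture the abstract describes, the sketch below passes one visual and one textual attribute vector through separate encoders and then through a shared layer that yields a single bimodal code. It covers only the encoder half of an autoencoder and performs no training. The layer sizes, the sigmoid activation, the random weights, the toy input values, and every function name are assumptions made for illustration; none of them is taken from the chapter.

```python
import math
import random


def sigmoid(x):
    """Logistic activation (an assumed choice, not stated in the abstract)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Random, untrained weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def encode(layer, vec):
    """Apply one dense layer followed by the sigmoid activation."""
    weights, biases = layer
    return [sigmoid(sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, biases)]


def bimodal_code(visual, textual, vis_layer, txt_layer, joint_layer):
    """Encode each modality separately, concatenate, then encode jointly."""
    h_vis = encode(vis_layer, visual)
    h_txt = encode(txt_layer, textual)
    return encode(joint_layer, h_vis + h_txt)


rng = random.Random(0)
N_VISUAL, N_TEXTUAL = 6, 4   # toy attribute-vector sizes (hypothetical)
N_HIDDEN, N_JOINT = 3, 2     # toy hidden-layer sizes (hypothetical)

vis_layer = init_layer(N_VISUAL, N_HIDDEN, rng)
txt_layer = init_layer(N_TEXTUAL, N_HIDDEN, rng)
joint_layer = init_layer(2 * N_HIDDEN, N_JOINT, rng)

# Made-up attribute scores for a single concept.
visual = [1.0, 0.0, 0.8, 0.0, 0.3, 0.0]
textual = [0.5, 0.0, 0.9, 0.1]

code = bimodal_code(visual, textual, vis_layer, txt_layer, joint_layer)
print(code)
```

In a trained model of this shape, the bimodal codes of two words could then be compared, for example with cosine similarity, on word-similarity tasks.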

Notes

1. We use the term word to denote any sequence of non-delimiting symbols.
2. We use the term concept to denote the mental representation of objects belonging to basic-level classes (e.g. dog), and the term category to refer to superordinate-level classes (e.g. animal).
3. By the term attributes we refer to semantic properties or characteristics of concepts (or categories), expressed by words which people would use to describe their meaning.
4. Available at http://homepages.inf.ed.ac.uk/csilbere/resources.html.
5. In the context of semantic representations, attributes are often called features or properties in the literature. For consistency, we adhere to the term attributes throughout the present work.
6. They are often termed semantic feature production norms (e.g. [66]) or property norms (e.g. [24]) in the literature.
7. Available at http://www.image-net.org.
8. Available at http://homepages.inf.ed.ac.uk/s1151656/resources.html.
9. The code by [28] is available at http://vision.cs.uiuc.edu/attributes/ (last accessed in May 2015).
10. Threshold values ranged from 0 to 0.9 in steps of 0.1.
11. For simplicity, we use the symbol w to denote both the concept and its index. Analogously, the symbol a denotes both the attribute and its index.
12. The software is available at http://clic.cimec.unitn.it/strudel/.
13. In a one-hot vector (a.k.a. 1-of-N coding), exactly one element is one and the others are zero. In our case, the non-zero element corresponds to the object label.
14. See [89] for more experiments.
15. Available at http://homepages.inf.ed.ac.uk/s1151656/resources.html.
16. The corpus is downloadable from http://wacky.sslmit.unibo.it/doku.php?id=corpora.
17. We performed random search over combinations of hyper-parameter values.
18. Available at http://w3.usf.edu/FreeAssociation.
19. We thank Elia Bruni for providing us with their data.
20. From http://wacky.sslmit.unibo.it/doku.php?id=corpora.
21. The vectors are available at https://code.google.com/p/word2vec/.
22. Available at http://homepages.inf.ed.ac.uk/s0897549/data/.
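
The one-hot coding described in note 13 and the threshold grid from note 10 can be sketched as follows. The helper name `one_hot` and the three-label inventory are hypothetical and serve only as an illustration; the chapter's actual label set is not shown here.

```python
def one_hot(index, size):
    """Return a vector with a single 1 at `index` and 0 everywhere else."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(size)]


# Hypothetical label inventory, for illustration only.
labels = ["dog", "bird", "car"]
bird_vector = one_hot(labels.index("bird"), len(labels))
print(bird_vector)  # [0, 1, 0]

# Threshold grid as in note 10: 0 to 0.9 in steps of 0.1.
# Dividing integers avoids the drift of repeatedly adding 0.1.
thresholds = [i / 10 for i in range(10)]
print(thresholds)
```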

References

1. Agirre, E., Soroa, A.: SemEval-2007 Task 02: Evaluating word sense induction and discrimination systems. In: Proceedings of the Fourth International Workshop on Semantic Evaluations (2007)
2. Andrews, M., Vigliocco, G., Vinson, D.: Integrating experiential and distributional data to learn semantic representations. Psychol. Rev. 116(3), 463–498 (2009)
3. Barbu, E.: Combining methods to learn feature-norm-like concept descriptions. In: Proceedings of the ESSLLI Workshop on Distributional Lexical Semantics (2008)
4. Baroni, M., Murphy, B., Barbu, E., Poesio, M.: Strudel: a corpus-based semantic model based on properties and types. Cogn. Sci. 34(2), 222–254 (2010)
5. Barsalou, L.: Perceptual symbol systems. Behav. Brain Sci. 22, 577–609 (1999)
6. Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2(1), 1–127 (2009)
7. Bengio, Y., Ducharme, R., Vincent, P., Janvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. (JMLR) 3, 1137–1155 (2003)
8. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: Conference on Neural Information Processing Systems (NIPS) (2006)
9. Biemann, C.: Chinese whispers - an efficient graph clustering algorithm and its application to natural language processing problems. In: Proceedings of TextGraphs: The 1st Workshop on Graph Based Methods for Natural Language Processing (2006)
10. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. (JMLR) 3, 993–1022 (2003)
11. Bornstein, M.H., Cote, L.R., Maital, S., Painter, K., Park, S.-Y., Pascual, L.: Cross-linguistic analysis of vocabulary in young children: Spanish, Dutch, French, Hebrew, Italian, Korean, and American English. Child Dev. 75(4), 1115–1139 (2004)
12. Bruni, E., Tran, G., Baroni, M.: Distributional semantics from text and images. In: Proceedings of the GEMS 2011 Workshop on Geometrical Models of Natural Language Semantics (2011)
13. Bruni, E., Boleda, G., Baroni, M., Tran, N.: Distributional semantics in technicolor. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (2012)
14. Bruni, E., Bordignon, U., Liska, A., Uijlings, J., Sergienya, I.: VSEM: an open library for visual semantics representation. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations (2013)
15. Bruni, E., Tran, N., Baroni, M.: Multimodal distributional semantics. J. Artif. Intell. Res. (JAIR) 49, 1–47 (2014)
16. Chen, H., Gallagher, A., Girod, B.: Describing clothing by semantic attributes. In: European Conference on Computer Vision (ECCV) (2012)
17. Cimpoi, M., Maji, S., Kokkinos, I., Mohamed, S., Vedaldi, A.: Describing textures in the wild. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
18. Collins, A.M., Loftus, E.F.: A spreading-activation theory of semantic processing. Psychol. Rev. 82(6), 407 (1975)
19. Collobert, R., Weston, J.: A unified architecture for natural language processing: deep neural networks with multitask learning. In: International Conference on Machine Learning (ICML) (2008)
20. Cree, G.S., McRae, K., McNorgan, C.: An attractor model of lexical conceptual processing: simulating semantic priming. Cogn. Sci. 23(3), 371–414 (1999)
21. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. J. Am. Soc. Inform. Sci. 41(6), 391–407 (1990)
22. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)
23. Devereux, B., Pilkington, N., Poibeau, T., Korhonen, A.: Towards unrestricted, large-scale acquisition of feature-based conceptual representations from corpus data. Res. Lang. Comput. 7(2–4), 137–170 (2009)
24. Devereux, B.J., Tyler, L.K., Geertzen, J., Randall, B.: The centre for speech, language and the brain (CSLB) concept property norms. Behav. Res. Methods (2013)
25. Duan, K., Parikh, D., Crandall, D., Grauman, K.: Discovering localized attributes for fine-grained recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
26. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes challenge 2008 results (2008)
27. Fan, R., Chang, K., Hsieh, C., Wang, X., Lin, C.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. (JMLR) 9, 1871–1874 (2008)
28. Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.: Describing objects by their attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)
29. Fellbaum, C. (ed.): WordNet: an electronic lexical database. The MIT Press (1998)
30. Feng, F., Li, R., Wang, X.: Constructing hierarchical image-tags bimodal representations for word tags alternative choice. In: Proceedings of the ICML Workshop on Challenges in Representation Learning (2013)
31. Feng, Y., Lapata, M.: Visual information in semantic representation. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (2010)
32. Ferrari, V., Zisserman, A.: Learning visual attributes. In: Conference on Neural Information Processing Systems (NIPS) (2007)
33. Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G., Ruppin, E.: Placing search in context: the concept revisited. ACM Trans. Inform. Syst. 20(1), 116–131 (2002)
34. Fountain, T., Lapata, M.: Meaning representation in natural language categorization. In: Proceedings of the 31st Annual Conference of the Cognitive Science Society (2010)
35. Frermann, L., Lapata, M.: Incremental Bayesian learning of semantic categories. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (2014)
36. Glenberg, A.M., Kaschak, M.P.: Grounding language in action. Psychon. Bull. Rev. 9(3), 558–565 (2002)
37. Goldstone, R.L., Kersten, A., Carvalho, P.F.: Concepts and categorization. In: Healy, A.F., Proctor, R.W. (eds.) Comprehensive Handbook of Psychology, vol. 4: Experimental Psychology, pp. 607–630. Wiley (2012)
38. Griffiths, T.L., Steyvers, M., Tenenbaum, J.B.: Topics in semantic representation. Psychol. Rev. 114(2), 211–244 (2007)
39. Grondin, R., Lupker, S., McRae, K.: Shared features dominate semantic richness effects for concrete concepts. J. Mem. Lang. 60(1), 1–19 (2009)
40. Harris, Z.S.: Distributional structure. Word 10(2–3), 146–162 (1954)
41. Hill, F., Korhonen, A.: Learning abstract concept embeddings from multi-modal data: since you probably can't see what I mean. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (2014)
42. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
43. Hsu, A.S., Martin, J.B., Sanborn, A.N., Griffiths, T.L.: Identifying representations of categories of discrete items using Markov Chain Monte Carlo with people. In: Proceedings of the 34th Annual Conference of the Cognitive Science Society (2012)
44. Huang, E.H., Socher, R., Manning, C.D., Ng, A.Y.: Improving word representations via global context and multiple word prototypes. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers (2012)
45. Huang, J., Kingsbury, B.: Audio-visual deep learning for noise robust speech recognition. In: Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (2013)
46. Huiskes, M.J., Lew, M.S.: The MIR Flickr retrieval evaluation. In: Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval (2008)
47. Johns, B.T., Jones, M.N.: Perceptual inference through global lexical similarity. Topics Cogn. Sci. 4(1), 103–120 (2012)
48. Jones, M.N., Willits, J.A., Dennis, S.: Models of semantic memory. In: Busemeyer, J., Townsend, J., Wang, Z., Eidels, A. (eds.) The Oxford Handbook of Computational and Mathematical Psychology, pp. 232–254. Oxford University Press (2015)
49. Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
50. Kelly, C., Devereux, B., Korhonen, A.: Acquiring human-like feature-based conceptual representations from corpora. In: NAACL HLT Workshop on Computational Neurolinguistics (2010)
51. Kiela, D., Bottou, L.: Learning image embeddings using convolutional neural networks for improved multi-modal semantics. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (2014)
52. Kim, Y., Lee, H., Provost, E.M.: Deep learning for robust feature generation in audiovisual emotion recognition. In: IEEE International Conference on Acoustics, Speech and Signal Processing (2013)
53. Kiros, R., Salakhutdinov, R., Zemel, R.: Unifying visual-semantic embeddings with multimodal neural language models. In: NIPS Deep Learning and Representation Learning Workshop (2014)
54. Kumar, N., Belhumeur, P.N., Nayar, S.K.: FaceTracer: a search engine for large collections of images with faces. In: European Conference on Computer Vision (ECCV) (2008)
55. Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Describable visual attributes for face verification and image search. IEEE Trans. Pattern Anal. Mach. Intell. (PAMI) 33(10), 1962–1977 (2011)
56. Laffont, P.-Y., Ren, Z., Tao, X., Qian, C., Hays, J.: Transient attributes for high-level understanding and editing of outdoor scenes. ACM Trans. Graph. 33(4), 149:1–149:11 (2014)
57. Lampert, C.H., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)
58. Landau, B., Smith, L., Jones, S.: Object perception and object naming in early development. Trends Cogn. Sci. 2(1), 19–24 (1998)
59. Landauer, T., Dumais, S.T.: A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol. Rev. 104(2), 211–240 (1997)
60. Lazaridou, A., Pham, N.T., Baroni, M.: Combining language and vision with a multimodal skip-gram model. In: Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the Association for Computational Linguistics (2015)
61. Liu, J., Kuipers, B., Savarese, S.: Recognizing human actions by attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
62. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. (IJCV) 60(2), 91–110 (2004)
63. Lund, K., Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence. Behav. Res. Methods Instrum. Comput. 28(2), 203–208 (1996)
64. Mao, J., Xu, W., Yang, Y., Wang, J., Yuille, A.L.: Explain images with multimodal recurrent neural networks. In: NIPS Deep Learning and Representation Learning Workshop (2014)
65. McRae, K., Jones, M.: Semantic memory. In: Reisberg, D. (ed.) The Oxford Handbook of Cognitive Psychology. Oxford University Press (2013)
66. McRae, K., Cree, G.S., Seidenberg, M.S., McNorgan, C.: Semantic feature production norms for a large set of living and nonliving things. Behav. Res. Methods 37(4), 547–559 (2005)
67. Medin, D.L., Schaffer, M.M.: Context theory of classification learning. Psychol. Rev. 85(3), 207–238 (1978)
68. Mervis, C.B., Rosch, E.: Categorization of natural objects. Annu. Rev. Psychol. 32(1), 89–115 (1981)
69. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Conference on Neural Information Processing Systems (NIPS) (2013)
70. Mnih, A., Hinton, G.E.: A scalable hierarchical distributed language model. In: Conference on Neural Information Processing Systems (NIPS) (2009)
71. Nelson, D.L., McEvoy, C.L., Schreiber, T.A.: The University of South Florida Word Association, Rhyme, and Word Fragment Norms (1998)
72. Ngiam, J., Khosla, A., Kim, M., Nam, J., Lee, H., Ng, A.: Multimodal deep learning. In: International Conference on Machine Learning (ICML) (2011)
73. O’Connor, C.M., Cree, G.S., McRae, K.: Conceptual hierarchies in a flat attractor network: dynamics of learning and computations. Cogn. Sci. 33(4), 665–708 (2009)
74. Osherson, D.N., Stern, J., Wilkie, O., Stob, M., Smith, E.E.: Default probability. Cogn. Sci. 15(2), 251–269 (1991)
75. Parikh, D., Grauman, K.: Relative attributes. In: International Conference on Computer Vision (ICCV) (2011)
76. Patterson, G., Xu, C., Su, H., Hays, J.: The SUN attribute database: beyond categories for deeper scene understanding. Int. J. Comput. Vis. (IJCV) 108(1–2), 59–81 (2014)
77. Patwardhan, S., Pedersen, T.: Using WordNet-based context vectors to estimate the semantic relatedness of concepts. In: Proceedings of the EACL 2006 Workshop on Making Sense of Sense: Bringing Computational Linguistics and Psycholinguistics Together (2006)
78. Perfetti, C.: The limits of co-occurrence: tools and theories in language research. Discourse Processes 25(2&3), 363–377 (1998)
79. Ranzato, M., Szummer, M.: Semi-supervised learning of compact document representations with deep networks. In: International Conference on Machine Learning (ICML) (2008)
80. Ranzato, M., Poultney, C., Chopra, S., LeCun, Y.: Efficient learning of sparse representations with an energy-based model. In: Conference on Neural Information Processing Systems (NIPS) (2006)
81. Rastegari, M., Diba, A., Parikh, D., Farhadi, A.: Multi-attribute queries: to merge or not to merge? In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)
82. Rogers, T.T., McClelland, J.L.: Semantic Cognition: A Parallel Distributed Processing Approach. The MIT Press (2004)
83. Rogers, T.T., Lambon Ralph, M.A., Garrard, P., Bozeat, S., McClelland, J.L., Hodges, J.R., Patterson, K.: Structure and deterioration of semantic memory: a neuropsychological and computational investigation. Psychol. Rev. 111(1), 205–235 (2004)
84. Roller, S., Schulte im Walde, S.: A Multimodal LDA model integrating textual, cognitive and visual modalities. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (2013)
85. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning internal representations by error propagation. In: Rumelhart, D.E., McClelland, J.L. (eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1: Foundations, pp. 318–362. The MIT Press (1986)
86. Russakovsky, O., Fei-Fei, L.: Attribute learning in large-scale datasets. In: ECCV International Workshop on Parts and Attributes (2010)
87. Russell, B., Torralba, A., Murphy, K., Freeman, W.: LabelMe: a database and web-based tool for image annotation. Int. J. Comput. Vis. (IJCV) 77, 157–173 (2008)
88. Salton, G., McGill, M.J.: Introduction to modern information retrieval. McGraw-Hill, Inc. (1986)
89. Silberer, C.: Learning Visually Grounded Meaning Representations. Ph.D. thesis, Institute for Language, Cognition and Computation, School of Informatics, The University of Edinburgh (2015)
90. Silberer, C., Lapata, M.: Grounded models of semantic representation. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (2012)
91. Sloman, S.A., Love, B.C., Ahn, W.-K.: Feature centrality and conceptual coherence. Cogn. Sci. 22(2), 189–228 (1998)
92. Smith, E.E., Shoben, E.J., Rips, L.J.: Structure and process in semantic memory: a featural model for semantic decisions. Psychol. Rev. 81(3), 214–241 (1974)
93. Socher, R., Pennington, J., Huang, E.H., Ng, A.Y., Manning, C.D.: Semi-supervised recursive autoencoders for predicting sentiment distributions. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (2011)
94. Socher, R., Karpathy, A., Le, Q.V., Manning, C., Ng, A.: Grounded compositional semantics for finding and describing images with sentences. Trans. Assoc. Comput. Linguist. 2, 207–218 (2014)
95. Sohn, K., Shang, W., Lee, H.: Improved multimodal deep learning with variation of information. In: Conference on Neural Information Processing Systems (NIPS) (2014)
96. Srivastava, N., Salakhutdinov, R.: Multimodal learning with deep Boltzmann machines. In: Conference on Neural Information Processing Systems (NIPS) (2012)
97. Srivastava, N., Salakhutdinov, R.: Multimodal learning with deep Boltzmann machines. J. Mach. Learn. Res. (JMLR) 15, 2949–2980 (2014)
98. Szumlanski, S., Gomez, F., Sims, V.K.: A new set of norms for semantic relatedness measures. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (2013)
99. Taylor, K.I., Devereux, B.J., Acres, K., Randall, B., Tyler, L.K.: Contrasting effects of feature-based statistics on the categorisation and basic-level identification of visual objects. Cognition 122(3), 363–374 (2012)
100. Turney, P.D., Pantel, P.: From frequency to meaning: vector space models of semantics. J. Artif. Intell. Res. (JAIR) 37(1), 141–188 (2010)
101. Tyler, L.K., Moss, H.E.: Towards a distributed account of conceptual knowledge. Trends Cogn. Sci. 5(6), 244–252 (2001)
102. Vanpaemel, W., Storms, G., Ons, B.: A varying abstraction model for categorization. In: Proceedings of the 27th Annual Conference of the Cognitive Science Society (2005)
103. Varma, M., Zisserman, A.: A statistical approach to texture classification from single images. Int. J. Comput. Vis. (IJCV) (Special Issue on Texture Analysis and Synthesis) 62(1–2), 61–81 (2005)
104. Vigliocco, G., Vinson, D.P., Lewis, W., Garrett, M.F.: Representing the meanings of object and action words: the featural and unitary semantic space hypothesis. Cogn. Psychol. 48(4), 422–488 (2004)
105. Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.-A.: Extracting and composing robust features with denoising autoencoders. In: International Conference on Machine Learning (ICML) (2008)
106. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. (JMLR) 11, 3371–3408 (2010)
107. Vinson, D.P., Vigliocco, G.: Semantic feature production norms for a large set of objects and events. Behav. Res. Methods 40(1), 183–190 (2008)
108. von Ahn, L., Dabbish, L.: Labeling images with a computer game. In: Conference on Human Factors in Computing Systems (2004)
109. Voorspoels, W., Vanpaemel, W., Storms, G.: Exemplars and prototypes in natural language concepts: a typicality-based evaluation. Psychon. Bull. Rev. 15, 630–637 (2008)
110. Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 Dataset. Technical Report CNS-TR-2011-001, California Institute of Technology (2011)
111. Westermann, G., Mareschal, D.: From perceptual to language-mediated categorization. Philos. Trans. R. Soc. B: Biol. Sci. 369(1634), 20120391 (2014)

Author information

Authors and Affiliations

Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, UK
Carina Silberer

Corresponding author

Correspondence to Carina Silberer.

Editor information

Editors and Affiliations

IBM T.J. Watson Research Center, Yorktown Heights, New York, USA
Rogerio Schmidt Feris
IST Austria Computer Vision and Machine Learning, Klosterneuburg, Austria
Christoph Lampert
Virginia Tech Electrical and Computer Engineering, Blacksburg, Virginia, USA
Devi Parikh

About this chapter

Cite this chapter

Silberer, C. (2017). Grounding the Meaning of Words with Visual Attributes. In: Feris, R., Lampert, C., Parikh, D. (eds) Visual Attributes. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-50077-5_13

DOI: https://doi.org/10.1007/978-3-319-50077-5_13
Published: 22 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50075-1
Online ISBN: 978-3-319-50077-5
eBook Packages: Computer Science, Computer Science (R0)