Distributional Analysis of Polysemous Function Words

Padó, Sebastian; Hole, Daniel

doi:10.1007/978-3-030-98479-3_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13206)

Included in the following conference series:

International Tbilisi Symposium on Logic, Language, and Computation

Abstract

In this paper, we are concerned with the phenomenon of function word polysemy. We adopt the framework of distributional semantics, which characterizes word meaning by observing occurrence contexts in large corpora and which is in principle well suited to modeling polysemy. Nevertheless, function words have traditionally been considered impossible to analyze distributionally because of their highly flexible usage patterns.

We establish that contextualized word embeddings, the most recent generation of distributional methods, offer hope in this regard. Using the German reflexive pronoun sich as an example, we find that contextualized word embeddings capture theoretically motivated word senses for sich to the extent to which these senses are mirrored systematically in linguistic usage.
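
As a rough illustration of the idea that occurrences of one word form can be separated into senses by their contextualized vectors, the following minimal sketch classifies a vector by its nearest sense centroid. Everything in it is an assumption made for illustration: the three-dimensional toy vectors stand in for real contextualized embeddings, the sense labels are placeholders rather than the sense inventory used in the paper, and the nearest-centroid rule is a simple stand-in, not the classifier used in the reported experiments.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train_centroids(labelled):
    """Group (vector, sense) pairs by sense and return one centroid per sense."""
    by_sense = {}
    for vector, sense in labelled:
        by_sense.setdefault(sense, []).append(vector)
    return {sense: centroid(vectors) for sense, vectors in by_sense.items()}


def classify(vector, centroids):
    """Return the sense whose centroid is most cosine-similar to the vector."""
    return max(centroids, key=lambda sense: cosine(vector, centroids[sense]))


# Hypothetical toy data: each vector stands in for the contextualized
# embedding of one occurrence of "sich"; the labels are placeholders only.
TOY_TRAIN = [
    ([0.9, 0.1, 0.0], "reflexive"),
    ([0.8, 0.2, 0.1], "reflexive"),
    ([0.1, 0.9, 0.2], "reciprocal"),
    ([0.0, 0.8, 0.3], "reciprocal"),
]
CENTROIDS = train_centroids(TOY_TRAIN)

if __name__ == "__main__":
    print(classify([0.7, 0.3, 0.0], CENTROIDS))
```

In a real setting the toy vectors would be replaced by embeddings of the target token extracted from a pretrained model, and senses that are not mirrored systematically in usage would be expected to be harder to separate in this way.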

Notes

1. We found a comparable, but slightly lower, performance for the sentential contexts in the classification experiments reported below. These experiments are part of the companion Jupyter notebook to this article.
2. We also experimented with fine-tuning the embeddings, but did not obtain competitive results, presumably due to the small size of the training set.

Author information

Authors and Affiliations

Natural Language Processing (IMS), University of Stuttgart, Stuttgart, Germany
Sebastian Padó
Linguistics (IL), University of Stuttgart, Stuttgart, Germany
Daniel Hole

Corresponding author

Correspondence to Sebastian Padó.

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Aybüke Özgün
Heinrich Heine University Düsseldorf, Düsseldorf, Germany
Yulia Zinova

About this paper

Cite this paper

Padó, S., Hole, D. (2022). Distributional Analysis of Polysemous Function Words. In: Özgün, A., Zinova, Y. (eds) Language, Logic, and Computation. TbiLLC 2019. Lecture Notes in Computer Science, vol 13206. Springer, Cham. https://doi.org/10.1007/978-3-030-98479-3_6

DOI: https://doi.org/10.1007/978-3-030-98479-3_6
Published: 31 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98478-6
Online ISBN: 978-3-030-98479-3
eBook Packages: Computer Science (R0)