On the Uses of Word Sense Change for Research in the Digital Humanities

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10450)

Abstract

As technology and culture advance, our language changes. We invent new words, add or change meanings of existing words, and rename existing things. Unfortunately, language carries no memory: words, expressions and meanings used in the past are forgotten over time. These changes pose a major challenge when searching and interpreting content from archives. In this paper, we present results of automatic word sense change detection and show their utility for archive users and for research in the digital humanities. Our method captures changes related to a word's usage and culture that cannot easily be found in dictionaries or other resources.

Notes

Acknowledgments

This work has been funded in part by the project “Towards a knowledge-based culturomics”, supported by a framework grant from the Swedish Research Council (2012–2016; dnr 2012-5738). It is also funded in part by the European Research Council under Alexandria (ERC 339233) and by the European Community’s H2020 Program under SoBigData (RIA 654024). We would like to thank Times Newspapers Limited for providing the archive of The Times for our research.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Språkbanken, University of Gothenburg, Gothenburg, Sweden
  2. University Library J.C. Senckenberg, Frankfurt, Germany
