Diachronic Deviation Features in Continuous Space Word Representations

  • Ni Sun
  • Tongfei Chen
  • Liumingjing Xiao
  • Junfeng Hu
Conference paper

DOI: 10.1007/978-3-319-12277-9_3

Part of the Lecture Notes in Computer Science book series (LNCS, volume 8801)
Cite this paper as:
Sun N., Chen T., Xiao L., Hu J. (2014) Diachronic Deviation Features in Continuous Space Word Representations. In: Sun M., Liu Y., Zhao J. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. Lecture Notes in Computer Science, vol 8801. Springer, Cham

Abstract

In distributed word representation, each word is represented as a single point in a vector space. This paper extends this to a diachronic setting, in which separate word embeddings are trained on corpora from different time periods. These embeddings can be mapped into a single target space via a linear transformation, so that each word is represented in the target space as a distribution rather than a point. The deviation features of this distribution reflect the semantic variation of the word across time periods. Experiments show that groups of words with similar deviation features can indicate the hot topics of different periods, and that the frequency change of these word groups can be used to detect the period in which a topic reached its peak popularity.
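The mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says that a linear transformation is used, so the least-squares fit, the choice of one period as the target space, the root-mean-square spread used as the deviation measure, and the function names `fit_linear_map` and `deviation_features` are all assumptions made for illustration.

```python
import numpy as np


def fit_linear_map(source, target):
    """Least-squares matrix W such that source @ W approximates target.

    Rows of both arrays are word vectors for the same vocabulary, in the same order.
    Least squares is an assumed fitting method, not one stated in the abstract.
    """
    W, *_ = np.linalg.lstsq(source, target, rcond=None)
    return W


def deviation_features(period_embeddings, target_index=0):
    """Map every period into one target space and measure each word's spread.

    period_embeddings: list of (n_words, dim) arrays, one per time period,
    with rows aligned to a shared vocabulary.
    Returns (mean, deviation): mean has shape (n_words, dim) and
    deviation has shape (n_words,).
    """
    target = period_embeddings[target_index]
    # Each period gets its own linear map into the target period's space.
    mapped = np.stack(
        [emb @ fit_linear_map(emb, target) for emb in period_embeddings]
    )
    # Per word: centre of its mapped points and their root-mean-square distance from it.
    mean = mapped.mean(axis=0)
    deviation = np.sqrt(((mapped - mean) ** 2).sum(axis=2).mean(axis=0))
    return mean, deviation
```

Under these assumptions, a word whose mapped vectors stay close together across periods gets a deviation near zero, while a word whose usage shifts in some period gets a larger value.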

Keywords

Lexical semantics, diachronic corpora, semantic distribution, hot topics

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Ni Sun (1, 2)
  • Tongfei Chen (1)
  • Liumingjing Xiao (1)
  • Junfeng Hu (1, 2)
  1. School of Electronics Engineering & Computer Science, Peking University, Beijing, P.R. China
  2. Key Laboratory of Computational Linguistics (Ministry of Education), P.R. China
