EmojiNet: Building a Machine Readable Sense Inventory for Emoji

  • Sanjaya Wijeratne
  • Lakshika Balasuriya
  • Amit Sheth
  • Derek Doran
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10046)

Abstract

Emoji are a contemporary and extremely popular way to enhance electronic communication. Without rigid semantics attached to them, emoji symbols take on different meanings based on the context of a message. Thus, like the word sense disambiguation task in natural language processing, machines also need to disambiguate the meaning or ‘sense’ of an emoji. In a first step toward achieving this goal, this paper presents EmojiNet, the first machine readable sense inventory for emoji. EmojiNet is a resource enabling systems to link emoji with their context-specific meaning. It is automatically constructed by integrating multiple emoji resources with BabelNet, which is the most comprehensive multilingual sense inventory available to date. The paper discusses its construction, evaluates the automatic resource creation process, and presents a use case where EmojiNet disambiguates emoji usage in tweets. EmojiNet is available online for use at http://emojinet.knoesis.org.
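
As a rough illustration of what a sense inventory lookup involves, the sketch below pairs an emoji with candidate senses and picks the sense whose context words overlap most with the message. It is a simplified Lesk-style overlap heuristic chosen for illustration, not the method or data format used by EmojiNet; the `EmojiSense` record, the `TOY_INVENTORY` entries, and the `disambiguate` function are all hypothetical names with made-up content.

```python
from dataclasses import dataclass, field


@dataclass
class EmojiSense:
    """One candidate meaning of an emoji (hypothetical record layout)."""
    label: str
    part_of_speech: str
    context_words: set = field(default_factory=set)


# Toy inventory with made-up entries; this is not data from EmojiNet.
TOY_INVENTORY = {
    "\U0001F525": [  # fire emoji
        EmojiSense("fire", "noun",
                   {"burn", "flame", "smoke", "hot", "wildfire"}),
        EmojiSense("excellent", "adjective",
                   {"great", "awesome", "song", "album", "track"}),
    ],
}


def disambiguate(emoji, message, inventory=TOY_INVENTORY):
    """Return the sense whose context words overlap most with the message.

    Ties go to the first sense listed; unknown emoji return None.
    """
    senses = inventory.get(emoji)
    if not senses:
        return None
    # Lowercase the message and strip basic punctuation from each token.
    tokens = {word.strip(".,!?").lower() for word in message.split()}
    return max(senses, key=lambda sense: len(sense.context_words & tokens))


best = disambiguate("\U0001F525", "This new album is awesome \U0001F525")
print(best.label)
```

A real inventory would hold many more senses per emoji and richer context, so a plain word-overlap count like this one should be read only as a starting point.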

Keywords

EmojiNet, Emoji analysis, Emoji sense disambiguation

Notes

Acknowledgments

We are grateful to Sujan Perera for thought-provoking discussions on the topic. We acknowledge partial support from the National Institute on Drug Abuse (NIDA) Grant No. 5R01DA039454-02: “Trending: Social Media Analysis to Monitor Cannabis and Synthetic Cannabinoid Use”, National Institutes of Health (NIH) award: MH105384-01A1: “Modeling Social Behavior for Healthcare Utilization in Depression”, and Grant No. 2014-PS-PSN-00006 awarded by the Bureau of Justice Assistance. The Bureau of Justice Assistance is a component of the U.S. Department of Justice’s Office of Justice Programs, which also includes the Bureau of Justice Statistics, the National Institute of Justice, the Office of Juvenile Justice and Delinquency Prevention, the Office for Victims of Crime, and the SMART Office. Points of view or opinions in this document are those of the authors and do not necessarily represent the official position or policies of the U.S. Department of Justice, NIH or NIDA.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Sanjaya Wijeratne (1)
  • Lakshika Balasuriya (1)
  • Amit Sheth (1)
  • Derek Doran (1)

  1. Kno.e.sis Center, Wright State University, Dayton, USA
