Named Entity Recognition in Twitter Using Images and Text

  • Diego Esteves
  • Rafael Peres
  • Jens Lehmann
  • Giulio Napolitano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10544)

Abstract

Named Entity Recognition (NER) is an important subtask of information extraction that seeks to locate and recognise named entities. Despite recent progress, correctly detecting and classifying entities remains difficult, particularly in short and noisy text such as tweets. A major drawback of most NER approaches is their heavy dependence on hand-crafted features and domain-specific knowledge, which they need to achieve state-of-the-art results. Devising models that cope with such linguistically complex contexts is therefore still challenging. In this paper, we propose a novel multi-level architecture that does not rely on any specific linguistic resource or encoded rule. Unlike traditional approaches, we use features extracted from both images and text to classify named entities. Experiments against state-of-the-art NER systems for Twitter on the Ritter dataset show competitive results (0.59 F-measure), indicating that this approach may lead towards better NER models.
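
For readers unfamiliar with the F-measure reported above, the following is a minimal sketch of how such a score can be computed for NER output. It assumes exact-match scoring over (start, end, type) entity spans, which may differ from the evaluation protocol used in the paper; the function name `f_measure` and the toy spans are illustrative and are not taken from the paper or the Ritter dataset.

```python
def f_measure(gold, predicted, beta=1.0):
    """Return (precision, recall, F-beta) over entity spans.

    Assumption: an entity counts as correct only if its start, end
    and type all match a gold entity exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, score


# Hypothetical toy data: (token_start, token_end, entity_type)
gold = [(0, 2, "PER"), (5, 6, "LOC"), (9, 10, "ORG")]
pred = [(0, 2, "PER"), (5, 6, "ORG"), (9, 10, "ORG"), (12, 13, "LOC")]

precision, recall, f1 = f_measure(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this toy case two of the four predicted spans match the gold annotation exactly, so precision is 0.50, recall is about 0.67, and the balanced F-measure is about 0.57.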

Keywords

NER, Short texts, Noisy data, Machine learning, Computer vision

Notes

Acknowledgments

This research was supported in part by an EU H2020 grant provided for the HOBBIT project (GA no. 688227) and CAPES Foundation (BEX 10179135).

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. University of Bonn, Bonn, Germany
  2. Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
  3. Fraunhofer IAIS, Sankt Augustin, Germany
