Abstract
Social media repositories are a significant source of evidence for extracting information about the reputation of a particular entity (e.g., a politician, singer or company). Reputation management experts need automated methods for mining these repositories, Twitter in particular, to monitor an entity's reputation. A significant research challenge in this setting is disambiguating tweets with respect to entity names. To address it, this paper uses “context phrases” in a tweet, together with the disambiguated Wikipedia articles for a particular entity, as features in a random forest classifier. We also use the concept of “relatedness” between a tweet and an entity, computed from the Wikipedia category-article structure, which captures the amount of discussion in a tweet that relates to the entity. The experimental evaluation shows a significant improvement over the baseline and performance comparable with other systems, a strong result given that we restrict ourselves to features extracted from Wikipedia.
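As a rough illustration of the idea in the abstract, the sketch below turns a tweet into two overlap features: how many of its context phrases also occur in phrases drawn from the Wikipedia article of the intended entity, and how many occur in phrases drawn from articles for other senses of the same name. The abstract does not specify the actual features, so the function names, the n-gram definition of a context phrase, and the hand-written phrase sets are all assumptions made for this example. In a full pipeline such feature vectors would be passed to a classifier such as a random forest.

```python
from typing import Dict, Set


def context_phrases(text: str, max_len: int = 2) -> Set[str]:
    """Return lowercase word n-grams (n = 1..max_len) found in the text.

    Assumption: a "context phrase" is approximated here by a word n-gram.
    """
    tokens = [t.strip(".,!?:;\"'()#@").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    phrases: Set[str] = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            phrases.add(" ".join(tokens[i:i + n]))
    return phrases


def tweet_features(
    tweet: str,
    entity_phrases: Set[str],
    other_sense_phrases: Set[str],
) -> Dict[str, float]:
    """Compute two hypothetical overlap features for one tweet."""
    phrases = context_phrases(tweet)
    total = len(phrases) or 1  # avoid division by zero for empty tweets
    return {
        "entity_overlap": len(phrases & entity_phrases) / total,
        "other_overlap": len(phrases & other_sense_phrases) / total,
    }


# Hand-written stand-ins for phrases that would come from Wikipedia articles.
company_phrases = {"iphone", "new iphone", "mac"}
fruit_phrases = {"fruit", "apple pie", "orchard"}

related = tweet_features("Apple unveils new iPhone", company_phrases, fruit_phrases)
unrelated = tweet_features("Baked an apple pie today", company_phrases, fruit_phrases)
```

Here `related` scores higher on `entity_overlap` and `unrelated` scores higher on `other_overlap`, which is the kind of signal a downstream classifier could learn from.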
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Qureshi, M.A., O’Riordan, C., Pasi, G. (2014). Exploiting Wikipedia for Entity Name Disambiguation in Tweets. In: Métais, E., Roche, M., Teisseire, M. (eds) Natural Language Processing and Information Systems. NLDB 2014. Lecture Notes in Computer Science, vol 8455. Springer, Cham. https://doi.org/10.1007/978-3-319-07983-7_25
DOI: https://doi.org/10.1007/978-3-319-07983-7_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07982-0
Online ISBN: 978-3-319-07983-7
eBook Packages: Computer Science (R0)