Improving Object and Event Monitoring on Twitter Through Lexical Analysis and User Profiling

  • Conference paper
Web Information Systems Engineering – WISE 2016 (WISE 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10042)

Abstract

Personal users on Twitter frequently post observations about their immediate environment as part of the 500 million tweets posted every day. These observations and their implicitly associated time and location data are a valuable source of information for monitoring objects and events, such as earthquakes, hailstorms, and shooting incidents. However, given the informal and uncertain expressions used in personal Twitter messages, and the various types of accounts existing on Twitter, capturing personal observations of objects and events is challenging. In contrast to existing supervised approaches, which require significant effort for annotating examples, in this paper we propose an unsupervised approach for filtering personal observations. Our approach employs lexical analysis, user profiling, and classification components to significantly improve filtering precision. To identify personal accounts, we define and compute a mean user profile for a dataset and employ distance metrics to evaluate the similarity of the user profiles under analysis to the mean. Our extensive experiments with real Twitter data show that our approach consistently improves the filtering precision of personal observations by around 22%.
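The profiling step described in the abstract (a mean user profile plus a distance metric) can be illustrated with a minimal sketch. This is not the paper's implementation: the three profile features, the choice of Euclidean distance, the threshold value, and the rule that accounts close to the mean are treated as personal are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

def mean_profile(profiles):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(profiles)
    dims = len(profiles[0])
    return [sum(p[i] for p in profiles) / n for i in range(dims)]

def euclidean(a, b):
    """Euclidean distance between two feature vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def likely_personal(profiles, threshold):
    """Flag each profile whose distance to the dataset mean is within threshold.

    Assumption: profiles near the mean are treated as personal accounts.
    """
    mean = mean_profile(profiles)
    return [euclidean(p, mean) <= threshold for p in profiles]

# Hypothetical, pre-normalised features per account, for example
# [share of tweets with links, share of replies, posting-rate score].
profiles = [
    [0.10, 0.20, 0.05],
    [0.12, 0.18, 0.04],
    [0.09, 0.22, 0.06],
    [0.95, 0.01, 0.90],  # an outlier, such as a news or bot account
]

flags = likely_personal(profiles, threshold=0.5)
print(flags)
```

With these made-up values the first three accounts fall within the threshold and the outlier does not. In practice the features would need to be scaled comparably before computing distances, and the threshold would have to be chosen from the data.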

Author information

Corresponding author

Correspondence to Yihong Zhang.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Zhang, Y., Szabo, C., Sheng, Q.Z. (2016). Improving Object and Event Monitoring on Twitter Through Lexical Analysis and User Profiling. In: Cellary, W., Mokbel, M., Wang, J., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2016. WISE 2016. Lecture Notes in Computer Science, vol 10042. Springer, Cham. https://doi.org/10.1007/978-3-319-48743-4_2

  • DOI: https://doi.org/10.1007/978-3-319-48743-4_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48742-7

  • Online ISBN: 978-3-319-48743-4

  • eBook Packages: Computer Science, Computer Science (R0)
