Estimating Time to Event of Future Events Based on Linguistic Cues on Twitter

  • Ali Hürriyetoğlu
  • Nelleke Oostdijk
  • Antal van den Bosch
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 740)

Abstract

Given a stream of Twitter messages about an event, we investigate the predictive power of features generated from words and temporal expressions in the messages to estimate the time to event (TTE). Average TTE values of the predictive features are learned from labeled training data, so that a TTE estimate can be provided for any event-related tweet in which they occur. We utilize temporal logic rules and a historical context integration function to improve the precision of the TTE estimation. In experiments on football matches and music concerts, we show that the estimates of the method are off by 4 and 10 hours on average, respectively, in terms of mean absolute error. We find that the type and size of the event affect the estimation quality. An out-of-domain test on music concerts shows that models and hyperparameters trained and optimized on football matches can be used to estimate the remaining time to concerts. Moreover, mixing concert events into the training data improves the precision of the average football event estimate.
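
The core idea of learning an average TTE per feature and reusing it at estimation time can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the chapter's implementation: features are reduced to lowercased whitespace-separated tokens, a tweet's estimate is taken as the plain mean of its known features' learned TTE values, and the temporal logic rules and historical context integration function are left out. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def learn_feature_tte(training_tweets):
    """Map each token to the mean TTE (in hours) of the training tweets containing it."""
    observed = defaultdict(list)
    for text, tte_hours in training_tweets:
        # Assumption: a feature is simply a lowercased whitespace token.
        for feature in set(text.lower().split()):
            observed[feature].append(tte_hours)
    return {feature: mean(values) for feature, values in observed.items()}


def estimate_tte(text, feature_tte):
    """Average the learned TTE of the known tokens in a tweet; None if none are known."""
    known = [feature_tte[f] for f in set(text.lower().split()) if f in feature_tte]
    return mean(known) if known else None


def mean_absolute_error(estimates, actuals):
    """Mean absolute difference between estimated and actual TTE values."""
    return mean(abs(e - a) for e, a in zip(estimates, actuals))


# Toy, made-up data: (tweet text, hours remaining until the event).
training = [
    ("match tomorrow night", 24.0),
    ("match tomorrow evening", 30.0),
    ("match tonight", 6.0),
    ("kickoff tonight", 4.0),
]
model = learn_feature_tte(training)

estimate = estimate_tte("kickoff tonight", model)
error = mean_absolute_error([estimate], [4.0])
```

In this sketch a tweet without any known feature yields no estimate at all; how such tweets are handled in the actual method is not stated in the abstract.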

Keywords

Smart city; Social media analysis; Natural language processing; Time-to-event estimation; Temporal expressions; Skipgrams; Football matches; Music concerts

Notes

Acknowledgements

This research was supported by the Dutch national programme COMMIT as part of the Infiniti project.

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Ali Hürriyetoğlu (1)
  • Nelleke Oostdijk (1)
  • Antal van den Bosch (1)
  1. Centre for Language Studies, Radboud University, Nijmegen, The Netherlands
