Analysing Relevant Diseases from Iberian Tweets

  • Víctor M. Prieto
  • Sergio Matos
  • Manuel Álvarez
  • Fidel Cacheda
  • José Luís Oliveira
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 222)


The Internet constitutes a huge source of information that individuals can exploit in many different ways. With the increasing use of social networks and blogs, the Internet is now used not only as an information source but also to disseminate personal health information. In this paper we exploit the wealth of user-generated data available through the micro-blogging service Twitter to estimate and track the incidence of health conditions in society, specifically in Portugal and Spain. We present results for the acquisition of relevant tweets for a set of four conditions (flu, depression, pregnancy and eating disorders) and for the binary classification of these tweets as relevant or not relevant to each condition. The results obtained, with AUC values ranging from 0.7 to 0.87, are very promising and indicate that such an approach provides a feasible solution for measuring and tracking the evolution of many health-related aspects within society.
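As an illustration of the evaluation measure quoted above, the following is a minimal sketch of how AUC can be computed for a binary relevant/not-relevant classifier. It is not the authors' implementation: the function name `auc`, the toy labels and the toy scores are assumptions made purely for this example. It uses the pairwise (Mann-Whitney) formulation, in which AUC is the probability that a randomly chosen relevant tweet receives a higher score than a randomly chosen non-relevant one.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney).

    labels: 1 for a relevant tweet, 0 for a non-relevant one.
    scores: classifier confidence that the tweet is relevant.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one relevant and one non-relevant example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # relevant tweet ranked above non-relevant one
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of relevant from non-relevant tweets, which gives a scale for reading the reported range of 0.7 to 0.87.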


Data mining, classification, social media, detecting health conditions





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Víctor M. Prieto
    • 1
  • Sergio Matos
    • 2
  • Manuel Álvarez
    • 1
  • Fidel Cacheda
    • 1
  • José Luís Oliveira
    • 2
  1. Department of Information and Communication Technologies, University of A Coruña, A Coruña, Spain
  2. DETI/IEETA, University of Aveiro, Aveiro, Portugal
