MEDCollector: Multisource Epidemic Data Collector

  • João Zamite
  • Fabrício A. B. Silva
  • Francisco Couto
  • Mário J. Silva
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6990)


We present a novel approach for epidemic data collection and integration based on the principles of interoperability and modularity. Accurate and timely epidemic models require large, fresh datasets. The World Wide Web, due to its explosion in data availability, represents a valuable source for epidemiological datasets. From an e-science perspective, collected data can be shared across multiple applications to enable the creation of dynamic platforms to extract knowledge from these datasets. Our approach, MEDCollector, addresses this problem by enabling data collection from multiple sources and its upload to the repository of an epidemic research information platform. Enabling the flexible use and configuration of services through workflow definition, MEDCollector is adaptable to multiple Web sources. Identified disease and location entities are mapped to ontologies, not only guaranteeing the consistency within gathered datasets but also allowing the exploration of relations between the mapped entities. MEDCollector retrieves data from the web and enables its packaging for later use in epidemic modeling tools.
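The workflow idea in the abstract can be illustrated with a minimal sketch. It is not MEDCollector's implementation: the `Record` type, the step functions, the two lookup tables, and the identifiers in them are hypothetical placeholders standing in for real sources and ontology services. It only shows how configurable steps might be chained so that different spellings of a disease resolve to one identifier before the data is packaged.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical lookup tables standing in for ontology services.
# The identifiers are placeholders, not real ontology codes.
DISEASE_IDS: Dict[str, str] = {
    "influenza": "disease:0001",
    "flu": "disease:0001",  # synonym resolves to the same identifier
    "dengue": "disease:0002",
}
LOCATION_IDS: Dict[str, str] = {
    "lisbon": "location:0001",
    "rio de janeiro": "location:0002",
}


@dataclass
class Record:
    source: str
    text: str
    disease_id: Optional[str] = None
    location_id: Optional[str] = None


# A workflow step takes a list of records and returns a list of records.
Step = Callable[[List[Record]], List[Record]]


def find_term(text: str, table: Dict[str, str]) -> Optional[str]:
    """Return the identifier of the first table term found as a whole word."""
    lowered = text.lower()
    for term, identifier in table.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            return identifier
    return None


def annotate(records: List[Record]) -> List[Record]:
    """Attach disease and location identifiers to each record."""
    for record in records:
        record.disease_id = find_term(record.text, DISEASE_IDS)
        record.location_id = find_term(record.text, LOCATION_IDS)
    return records


def keep_mapped(records: List[Record]) -> List[Record]:
    """Drop records that lack a disease or a location identifier."""
    return [r for r in records if r.disease_id and r.location_id]


def run_workflow(records: List[Record], steps: List[Step]) -> List[Record]:
    """Apply the configured steps in order."""
    for step in steps:
        records = step(records)
    return records


def package(records: List[Record]) -> List[Dict[str, str]]:
    """Convert mapped records to plain dictionaries for later use."""
    return [
        {"source": r.source, "disease": r.disease_id, "location": r.location_id}
        for r in records
    ]
```

Because the workflow is just a list of steps, supporting a new source or adding a filter would mean changing the list passed to `run_workflow`, not the steps themselves.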


Keywords: Epidemic Surveillance, Data Collection, Information Integration, Workflow Design





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • João Zamite (1)
  • Fabrício A. B. Silva (2)
  • Francisco Couto (1)
  • Mário J. Silva (1)

  1. LaSIGE, Faculty of Science, University of Lisbon, Portugal
  2. Information Technology Division, Army Technology Center, Rio de Janeiro, Brazil
