Abstract
We present a novel approach to epidemic data collection and integration based on the principles of interoperability and modularity. Accurate and timely epidemic models require large, up-to-date datasets. The World Wide Web, with its rapidly growing volume of available data, is a valuable source of epidemiological datasets. From an e-science perspective, collected data can be shared across multiple applications, enabling dynamic platforms that extract knowledge from these datasets. Our approach, MEDCollector, addresses this need by collecting data from multiple sources and uploading it to the repository of an epidemic research information platform. Because its services can be flexibly used and configured through workflow definitions, MEDCollector is adaptable to multiple Web sources. Identified disease and location entities are mapped to ontologies, which not only guarantees consistency within the gathered datasets but also allows relations between the mapped entities to be explored. MEDCollector retrieves data from the Web and enables it to be packaged for later use in epidemic modeling tools.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Zamite, J., Silva, F.A.B., Couto, F., Silva, M.J. (2011). MEDCollector: Multisource Epidemic Data Collector. In: Hameurlain, A., Küng, J., Wagner, R., Böhm, C., Eder, J., Plant, C. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems IV. Lecture Notes in Computer Science, vol 6990. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23740-9_3
DOI: https://doi.org/10.1007/978-3-642-23740-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23739-3
Online ISBN: 978-3-642-23740-9
eBook Packages: Computer Science (R0)