Journal of Medical Systems, Volume 36, Issue 2, pp 475–481

Information Extraction Approaches to Unconventional Data Sources for “Injury Surveillance System”: the Case of Newspapers Clippings

  • Paola Berchialla
  • Cecilia Scarinzi
  • Silvia Snidero
  • Yousif Rahim
  • Dario Gregori
Original Paper

DOI: 10.1007/s10916-010-9492-1

Cite this article as:
Berchialla, P., Scarinzi, C., Snidero, S. et al. J Med Syst (2012) 36: 475. doi:10.1007/s10916-010-9492-1

Abstract

Injury Surveillance Systems based on traditional hospital records or clinical data have the advantage of being a well-established, highly reliable source of information for active surveillance of specific injuries, such as choking in children. However, they suffer from delays in making data available for analysis, owing to inefficiencies in data collection procedures. Integrating clinically based registries with unconventional data sources, such as newspaper articles, therefore has the advantage of making the system more useful for early alerting. Using such sources is difficult because the information is available only as free natural-language documents rather than as the structured databases required by traditional data mining techniques. Information Extraction (IE) addresses the problem of transforming a corpus of textual documents into a more structured database. In this paper, on a corpus of Italian newspaper articles related to choking in children due to ingestion/inhalation of a foreign body, we compared the performance of three IE algorithms: (a) a classical rule-based system, which requires manual annotation of the rules; (b) a rule-based system that allows the rules to be built automatically; and (c) a machine learning method based on Support Vector Machines. Although some useful indications are extracted from the newspaper clippings, this approach is at present far from being routinely implemented for injury surveillance purposes.
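
To make the idea of Information Extraction concrete, the following is a minimal sketch of how a hand-written, rule-based extractor might turn one free-text clipping into a structured record. It is not the system evaluated in the paper: the field names (`age_years`, `foreign_body`, `outcome`), the small lexicon of objects, the regular expressions, and the English sample sentence are all assumptions made for illustration, whereas the actual corpus consists of Italian newspaper articles.

```python
import re
from typing import Optional, TypedDict


class InjuryRecord(TypedDict):
    """Structured row produced from one clipping (hypothetical schema)."""
    age_years: Optional[int]
    foreign_body: Optional[str]
    outcome: Optional[str]


# Hand-written rules; all patterns and lexicon entries are illustrative assumptions.
AGE_RULE = re.compile(r"\b(\d{1,2})[- ]year[- ]old\b", re.IGNORECASE)
FOREIGN_BODIES = ("coin", "battery", "peanut", "grape", "marble", "toy")
OBJECT_RULE = re.compile(r"\b(" + "|".join(FOREIGN_BODIES) + r")s?\b", re.IGNORECASE)
# Outcome rules are tried in order; the first one that matches wins.
OUTCOME_RULES = (
    ("fatal", re.compile(r"\b(died|death|fatal)\b", re.IGNORECASE)),
    ("hospitalized", re.compile(r"\b(hospitali[sz]ed|admitted)\b", re.IGNORECASE)),
)


def extract_record(text: str) -> InjuryRecord:
    """Apply each rule to the text; fields with no matching rule stay None."""
    age = AGE_RULE.search(text)
    obj = OBJECT_RULE.search(text)
    outcome = next(
        (label for label, rule in OUTCOME_RULES if rule.search(text)), None
    )
    return {
        "age_years": int(age.group(1)) if age else None,
        "foreign_body": obj.group(1).lower() if obj else None,
        "outcome": outcome,
    }


if __name__ == "__main__":
    clipping = "A 3-year-old boy was hospitalized after he swallowed a coin at home."
    print(extract_record(clipping))
```

In a sketch like this every rule is written by hand, which corresponds to the manual effort of approach (a). Approaches (b) and (c) would instead derive the rules, or a classifier, from annotated examples.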

Keywords

Injury surveillance systems; Text analysis; Injury prevention; Public health; Data mining

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Paola Berchialla
    • 1
  • Cecilia Scarinzi
    • 2
  • Silvia Snidero
    • 3
  • Yousif Rahim
    • 4
  • Dario Gregori
    • 1
    • 5
  1. Department of Public Health and Microbiology, University of Torino, Torino, Italy
  2. Department of Statistics and Applied Mathematics D. de Castro, University of Torino, Torino, Italy
  3. S&A S.r.l., Cuneo, Italy
  4. International Society for Violence and Injury Prevention, Stockholm, Norway
  5. Department of Environmental Medicine and Public Health, Padova, Italy