Automatic Dictionary- and Rule-Based Systems for Extracting Information from Text

  • Sergio Bolasco
  • Pasquale Pavone
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


The paper offers a general introduction to the use of meta-information from a text mining perspective. The aim is to build a meta-dictionary as a linguistic resource that can be reused across different applications. The procedure relies on a hybrid system: the proposed algorithm applies dictionaries and rules jointly and recursively, the rules being both lexical and textual. An application to a corpus of diaries from the Time Use Survey (TUS) conducted by Istat is illustrated.
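The hybrid idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify in detail. It assumes that a lexical rule decides from the form of a token alone, that a textual rule decides from the categories of the preceding tokens, and that "recursive" means rule hits are fed back into the dictionary until nothing new is found. The seed entries, category labels, rules, function name, and English toy diaries are all hypothetical.

```python
import re

# Hypothetical seed dictionary: token -> category (illustrative entries only).
SEED = {"with": "COMPANY_MARKER", "and": "CONJ", "lunch": "MEAL"}

# Lexical rules (assumed form): decided from the shape of the token alone.
LEXICAL_RULES = [(re.compile(r"\d{1,2}[.:]\d{2}"), "TIME")]

# Textual rules (assumed form): decided from the categories of preceding tokens.
TEXTUAL_RULES = [
    (("COMPANY_MARKER",), "PERSON"),   # unknown token right after "with"
    (("PERSON", "CONJ"), "PERSON"),    # unknown token after "<person> and"
]


def build_meta_dictionary(texts, seed, lexical_rules, textual_rules, max_passes=10):
    """Grow a dictionary by alternating dictionary lookup and rule application."""
    lexicon = dict(seed)  # working copy, so the seed is left untouched
    docs = [text.lower().split() for text in texts]
    for _ in range(max_passes):
        learned = {}
        for tokens in docs:
            # Dictionary step: tag every token the current lexicon knows.
            tags = [lexicon.get(tok) for tok in tokens]
            for i, tok in enumerate(tokens):
                if tags[i] is not None:
                    continue
                # Lexical step: test the token form against each pattern.
                label = next(
                    (lab for pat, lab in lexical_rules if pat.fullmatch(tok)), None
                )
                # Textual step: test the categories of the left context.
                if label is None:
                    for context, lab in textual_rules:
                        n = len(context)
                        if i >= n and tuple(tags[i - n:i]) == context:
                            label = lab
                            break
                if label is not None:
                    learned[tok] = label
        if not learned:
            break  # fixed point: another pass would add nothing
        lexicon.update(learned)  # feed rule hits back into the dictionary
    return lexicon


diaries = ["12.30 lunch with giulia and marco", "walk with marco and anna"]
lexicon = build_meta_dictionary(diaries, SEED, LEXICAL_RULES, TEXTUAL_RULES)
single_pass = build_meta_dictionary(
    diaries, SEED, LEXICAL_RULES, TEXTUAL_RULES, max_passes=1
)
```

In this toy run, "anna" can only be categorized on the second pass, once "marco" has entered the dictionary through the first textual rule; with `max_passes=1` it stays unknown. That dependency is what the repeated passes are meant to capture.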


Hybrid System, Regular Expression, Text Mining, Linguistic Resource, Textual Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. Dipartimento di Studi Geoeconomici, Linguistici, Statistici, Storici per l’Analisi Regionale, Sapienza University of Rome, Roma, Italy
