Knowledge and Information Systems, Volume 15, Issue 3, pp 285–320

Boosting text segmentation via progressive classification

  • Eugenio Cesario
  • Francesco Folino
  • Antonio Locane
  • Giuseppe Manco
  • Riccardo Ortale
Regular Paper

Abstract

A novel approach for reconciling tuples stored as free text into an existing attribute schema is proposed. The basic idea is to subject the available text to progressive classification, i.e., a multi-stage classification scheme where, at each intermediate stage, a classifier is learnt that analyzes the textual fragments not reconciled at the end of the previous steps. Classification is accomplished by an ad hoc exploitation of traditional association mining algorithms, and is supported by a data transformation scheme which takes advantage of domain-specific dictionaries/ontologies. A key feature is the capability of progressively enriching the available ontology with the results of the previous stages of classification, thus significantly improving the overall classification accuracy. An extensive experimental evaluation shows the effectiveness of our approach.
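
The multi-stage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: simple "previous label => label" rules filtered by confidence stand in for the association mining step, and every name here (`learn_rules`, `progressive_classify`, `seed_dictionary`, the attribute labels, the sample records) is hypothetical. Each stage labels tokens found in the dictionary, learns rules from what is labelled so far, applies them to the fragments still unresolved, and adds the newly labelled tokens to the dictionary for the next stage.

```python
from collections import Counter


def learn_rules(labels, min_conf):
    """Learn 'previous label => label' rules whose confidence is >= min_conf."""
    pair_counts, prev_counts = Counter(), Counter()
    for rec_labels in labels:
        for prev, cur in zip(rec_labels, rec_labels[1:]):
            if prev is not None and cur is not None:
                pair_counts[(prev, cur)] += 1
                prev_counts[prev] += 1
    best = {}  # prev label -> (predicted label, confidence)
    for (prev, cur), n in pair_counts.items():
        conf = n / prev_counts[prev]
        if conf >= min_conf and conf > best.get(prev, (None, 0.0))[1]:
            best[prev] = (cur, conf)
    return {prev: cur for prev, (cur, _) in best.items()}


def progressive_classify(records, dictionary, max_stages=3, min_conf=0.6):
    """Label tokens stage by stage, enriching a copy of the dictionary."""
    dictionary = dict(dictionary)  # do not mutate the caller's dictionary
    labels = [[None] * len(rec) for rec in records]
    for _ in range(max_stages):
        # Step 1: label every still-unresolved token known to the dictionary.
        for r, rec in enumerate(records):
            for i, tok in enumerate(rec):
                if labels[r][i] is None:
                    labels[r][i] = dictionary.get(tok)
        # Step 2: learn rules from the tokens labelled so far.
        rules = learn_rules(labels, min_conf)
        # Step 3: classify unresolved tokens and enrich the dictionary.
        changed = False
        for r, rec in enumerate(records):
            for i in range(1, len(rec)):
                if labels[r][i] is None and labels[r][i - 1] in rules:
                    labels[r][i] = rules[labels[r][i - 1]]
                    dictionary[rec[i]] = labels[r][i]
                    changed = True
        if not changed:
            break
    return labels, dictionary


# Hypothetical address fragments; "garibaldi" is missing from the seed dictionary.
records = [
    ["via", "roma", "cosenza"],
    ["via", "milano", "rende"],
    ["via", "garibaldi", "rende"],
    ["garibaldi", "cosenza"],
]
seed_dictionary = {
    "via": "STREET_TYPE",
    "roma": "STREET_NAME",
    "milano": "STREET_NAME",
    "cosenza": "CITY",
    "rende": "CITY",
}

labels, enriched = progressive_classify(records, seed_dictionary)
single_stage_labels, _ = progressive_classify(records, seed_dictionary, max_stages=1)
```

In this toy run, the first stage resolves "garibaldi" in the third record from its context and adds it to the dictionary; only the second stage can then label the same token in the fourth record, where it has no preceding context. With `max_stages=1` that token stays unresolved, which is the effect of enrichment the abstract refers to.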

Keywords

  • Schema reconciliation
  • Text segmentation
  • Classification

Copyright information

© Springer-Verlag London Limited 2007

Authors and Affiliations

  • Eugenio Cesario (1)
  • Francesco Folino (1)
  • Antonio Locane (1)
  • Giuseppe Manco (1)
  • Riccardo Ortale (1)

  1. ICAR-CNR, Rende (CS), Italy
