Text Mining

Part of the series Theory and Applications of Natural Language Processing pp 41-62


Simple, Fast and Accurate Taxonomy Learning



Although many algorithms have been developed to extract lexical resources, few organize the mined terms into taxonomies. We propose (1) a semi-supervised algorithm that uses a root term, a seed example, and lexico-syntactic patterns to automatically learn from the Web the hyponyms and hypernyms subordinate to the root; (2) a Web-based concept positioning test to validate the learned terms and is-a relations; (3) a graph algorithm that induces the taxonomy structure of all terms from scratch; and (4) a pattern-based procedure for enriching the learned taxonomies with verb-based relations. We conduct an exhaustive empirical evaluation on four different domains and show that our algorithm quickly and accurately acquires and taxonomizes the knowledge. We conduct comparative studies against WordNet and existing knowledge repositories and show that our algorithm finds many additional terms and relations missing from these resources. We also conduct an evaluation against other taxonomization algorithms and show how our algorithm can further enrich the taxonomies with verb-based relations.
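To make step (3) concrete, here is a minimal sketch of one way to induce a tree from a bag of validated is-a pairs. It is an illustration only, not the chapter's actual graph algorithm, which the abstract does not specify. The assumptions are ours: the function name `induce_taxonomy` is hypothetical, the is-a graph is taken to be acyclic, and each term is attached to the hypernym lying on its longest chain to the root, which discards shortcut edges such as "lion is-a animal" when "lion is-a feline is-a mammal is-a animal" is also known.

```python
from collections import defaultdict
from functools import lru_cache


def induce_taxonomy(is_a_pairs, root):
    """Map each term to a single parent, chosen on its longest is-a chain to root.

    Illustrative sketch: assumes the is-a pairs form an acyclic graph.
    """
    parents = defaultdict(set)
    for hyponym, hypernym in is_a_pairs:
        parents[hyponym].add(hypernym)

    @lru_cache(maxsize=None)
    def depth(term):
        # Length of the longest is-a chain from term up to root,
        # or -1 when root cannot be reached from term.
        if term == root:
            return 0
        best = -1
        for parent in parents.get(term, ()):
            d = depth(parent)
            if d >= 0:
                best = max(best, d + 1)
        return best

    taxonomy = {}
    for term in parents:
        # Keep only a parent that sits exactly one level above the term.
        candidates = [
            p for p in sorted(parents[term])
            if depth(p) >= 0 and depth(p) + 1 == depth(term)
        ]
        if candidates:
            taxonomy[term] = candidates[0]  # alphabetical tie-break (arbitrary)
    return taxonomy


# Toy input: hand-written pairs, not data from the chapter.
pairs = [
    ("lion", "animal"), ("lion", "mammal"), ("lion", "feline"),
    ("feline", "mammal"), ("mammal", "animal"), ("rock", "mineral"),
]
tree = induce_taxonomy(pairs, root="animal")
```

Terms that never reach the root (here "rock") are left out of the result, which loosely mirrors the idea of keeping only terms subordinate to the root.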