Date: 19 Oct 2006

From the data mine to the knowledge mill: Applying the principles of lexical analysis to the data mining and knowledge discovery process

Abstract

This paper argues that the traditional approach to data mining is dominated by quantitative tools that assume knowledge to be inherent in the data, leaving the data miner's task simply to find it. We propose, however, that true knowledge arises from an interaction between the information and the user.

The notion of user interaction in data mining demands a modified approach. An environment must be developed in which the user is encouraged to participate in an interactive learning cycle, where knowledge is progressively extracted from the data. The combined techniques of lexical approximation, hypertext navigation and quantitative statistics can form the foundation stones of this “knowledge mill” by permitting a progressive entry into the information and the identification of trends not readily visible via other techniques. Such practices are no longer the exclusive domain of large corporations with in-house databases, but are open to anyone wishing to explore internal or external data sets.