Speeding up knowledge discovery in large relational databases by means of a new discretization algorithm

  • Alex Alves Freitas
  • Simon H. Lavington
Technical Papers Optimisation/Performance Issues

DOI: 10.1007/3-540-61442-7_8

Part of the Lecture Notes in Computer Science book series (LNCS, volume 1094)
Cite this paper as:
Freitas A.A., Lavington S.H. (1996) Speeding up knowledge discovery in large relational databases by means of a new discretization algorithm. In: Morrison R., Kennedy J. (eds) Advances in Databases. BNCOD 1996. Lecture Notes in Computer Science, vol 1094. Springer, Berlin, Heidelberg

Abstract

Most of the KDD (Knowledge Discovery in Databases) algorithms proposed in the literature have been applied to relatively small datasets and do not permit any integration with a DBMS. Hence, the application of these algorithms to the huge amounts of data found in current databases and data warehouses faces serious scalability problems, particularly the problem of excessive learning time. This paper investigates a way of improving the scalability of KDD algorithms, via discretization of ordinal or continuous attributes. This work has two novel aspects. First, we map a generic discretization primitive into an SQL query. Second, we propose a new discretization algorithm for classification tasks. We show how the new discretization algorithm can be implemented with good effect via the SQL primitive.
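As a rough illustration of the first idea in the abstract, a discretization step can obtain the class distribution for each distinct value of an attribute from a single SQL GROUP BY query, so the counting is done by the DBMS rather than by the mining program. The sketch below is an assumption about what such a primitive might look like, not the query or algorithm defined in the paper. The table `customers`, the columns `age` and `risk`, and the helper names are hypothetical, and SQLite stands in for a full DBMS.

```python
import sqlite3


def class_counts_by_value(conn, table, attribute, class_attribute):
    """Return {attribute_value: {class_value: count}} from one GROUP BY query.

    Hypothetical helper: identifiers are interpolated directly into the SQL
    text, so this sketch assumes the table and column names are trusted.
    """
    query = (
        f"SELECT {attribute}, {class_attribute}, COUNT(*) "
        f"FROM {table} "
        f"GROUP BY {attribute}, {class_attribute} "
        f"ORDER BY {attribute}"
    )
    counts = {}
    for value, cls, n in conn.execute(query):
        counts.setdefault(value, {})[cls] = n
    return counts


def build_demo_connection():
    """Create a tiny in-memory relation used only for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (age INTEGER, risk TEXT)")
    rows = [
        (25, "high"), (25, "high"),
        (30, "high"), (30, "low"),
        (45, "low"), (45, "low"),
    ]
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return conn


conn = build_demo_connection()
counts = class_counts_by_value(conn, "customers", "age", "risk")
```

A discretization algorithm could then scan `counts` in attribute order to decide where to place interval boundaries, without reading the underlying tuples again.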

Copyright information

© Springer-Verlag 1996

Authors and Affiliations

  • Alex Alves Freitas (1)
  • Simon H. Lavington (1)
  1. Dept. of Computer Science, University of Essex, Colchester, UK
