Parallel induction algorithms for data mining

  • John Darlington
  • Yi-ke Guo
  • Janjao Sutiwaraphun
  • Hing Wing To
Conference paper

DOI: 10.1007/BFb0052860

Part of the Lecture Notes in Computer Science book series (LNCS, volume 1280)
Cite this paper as:
Darlington J., Guo Y., Sutiwaraphun J., To H.W. (1997) Parallel induction algorithms for data mining. In: Liu X., Cohen P., Berthold M. (eds) Advances in Intelligent Data Analysis Reasoning about Data. IDA 1997. Lecture Notes in Computer Science, vol 1280. Springer, Berlin, Heidelberg

Abstract

In the last decade, there has been explosive growth in the generation and collection of data. Nonetheless, the quality of the information inferred from this voluminous data has not been proportional to its size. One reason is that the computational complexity of the algorithms used to extract information from the data is normally proportional to the number of input data items, which results in prohibitive execution times on large data sets. Parallelism is one solution to this problem. In this paper we present preliminary results of experiments in parallelising C4.5, a classification-rule learning system that uses decision trees as its model representation and that has been used as a base model for investigating methods of parallelising induction algorithms. The experiments assess the potential for improving execution time by exploiting parallelism in the algorithm.
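
As an illustration of the kind of parallelism the abstract refers to, the following is a minimal sketch and not the authors' method, which the abstract does not describe. It assumes one possible strategy: the training rows are split into chunks, each chunk is counted independently, and the partial counts are merged before computing the information gain of a categorical attribute, a quantity used when a decision-tree learner chooses a split. The function names, the row layout (class label in the last column) and the `n_workers` parameter are all hypothetical. A thread pool is used only to keep the example self-contained; a real speed-up on CPU-bound counting would likely need separate processes or machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from math import log2


def count_chunk(chunk, attr_index):
    """Count (attribute value, class label) pairs in one chunk of rows."""
    return Counter((row[attr_index], row[-1]) for row in chunk)


def entropy(class_counts):
    """Shannon entropy, in bits, of a mapping from class label to count."""
    total = sum(class_counts.values())
    return -sum((n / total) * log2(n / total) for n in class_counts.values() if n)


def information_gain(rows, attr_index, n_workers=4):
    """Information gain of one categorical attribute (hypothetical helper).

    Assumption: each row is a tuple whose last element is the class label.
    """
    # Split the rows into roughly equal chunks, one per worker.
    size = max(1, -(-len(rows) // n_workers))  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]

    # Count each chunk independently, then merge the partial counts.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = pool.map(count_chunk, chunks, [attr_index] * len(chunks))
        joint = sum(partial, Counter())

    # Derive overall class counts and per-value class counts from the merge.
    class_counts = Counter()
    by_value = {}
    for (value, label), n in joint.items():
        class_counts[label] += n
        by_value.setdefault(value, Counter())[label] += n

    # Gain = entropy before the split minus weighted entropy after it.
    total = sum(class_counts.values())
    remainder = sum(
        sum(c.values()) / total * entropy(c) for c in by_value.values()
    )
    return entropy(class_counts) - remainder
```

Because only the per-chunk counts are combined, the merged result does not depend on how the rows are partitioned, so the number of workers affects the running time but not the computed gain.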

Copyright information

© Springer-Verlag 1997

Authors and Affiliations

  • John Darlington (1)
  • Yi-ke Guo (1)
  • Janjao Sutiwaraphun (1)
  • Hing Wing To (1)
  1. Department of Computing, Imperial College, London, UK
