Parallel and Distributed Processing

Volume 1800 of the series Lecture Notes in Computer Science pp 358-365


A Requirements Analysis for Parallel KDD Systems

  • William A. Maniatty, Computer Science Dept., University at Albany
  • Mohammed J. Zaki, Computer Science Dept., Rensselaer Polytechnic Institute



The current generation of data mining tools has limited capacity and performance because these tools tend to be sequential. This paper explores a migration path out of this bottleneck by considering an integrated hardware and software approach to parallelizing data mining. Our analysis shows that parallel data mining solutions require the following components: parallel data mining algorithms, parallel and distributed databases, parallel file systems, parallel I/O, tertiary storage, management of online data, support for heterogeneous data representations, security, quality of service, and pricing metrics. State-of-the-art technology in these areas is surveyed with an eye towards an integration strategy leading to a complete solution.