KNOWLEDGE GRID: High Performance Knowledge Discovery Services on the Grid

  • Mario Cannataro
  • Domenico Talia
  • Paolo Trunfio
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2242)


Knowledge discovery tools and techniques are used in an increasing number of scientific and commercial areas for the analysis of large data sets. When large data repositories are coupled with the geographic distribution of data, users, and systems, different technologies must be combined to implement high-performance distributed knowledge discovery systems. At the same time, the computational grid is emerging as a very promising infrastructure for high-performance distributed computing. In this paper we introduce a software architecture for parallel and distributed knowledge discovery (PDKD) systems that is built on top of computational grid services, which provide dependable, consistent, and pervasive access to high-end computational resources. The proposed architecture uses the grid services and defines a set of additional layers to implement the services of the distributed knowledge discovery process on grid-connected sequential or parallel computers.


  • Grid Service
  • Execution Plan
  • Grid Infrastructure
  • Data Mining Tool
  • Distribute Data Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Mario Cannataro (1)
  • Domenico Talia (2)
  • Paolo Trunfio (1)

  1. ISI-CNR, Rende (CS), Italy
  2. DEIS, Università della Calabria, Rende (CS), Italy
