Abstract
Knowledge discovery tools and techniques are used in an increasing number of scientific and commercial areas for the analysis of large data sets. When large data repositories are coupled with geographic distribution of data, users, and systems, different technologies must be combined to implement high-performance distributed knowledge discovery systems. At the same time, the computational grid is emerging as a very promising infrastructure for high-performance distributed computing. In this paper we introduce a software architecture for parallel and distributed knowledge discovery (PDKD) systems that is built on top of computational grid services, which provide dependable, consistent, and pervasive access to high-end computational resources. The proposed architecture uses the grid services and defines a set of additional layers to implement the services of the distributed knowledge discovery process on grid-connected sequential or parallel computers.
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cannataro, M., Talia, D., Trunfio, P. (2001). KNOWLEDGE GRID: High Performance Knowledge Discovery Services on the Grid. In: Lee, C.A. (eds) Grid Computing — GRID 2001. GRID 2001. Lecture Notes in Computer Science, vol 2242. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45644-9_5
DOI: https://doi.org/10.1007/3-540-45644-9_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42949-4
Online ISBN: 978-3-540-45644-5
eBook Packages: Springer Book Archive