Parallelism in Knowledge Discovery Techniques

Talia, Domenico

doi:10.1007/3-540-48051-X_14

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2367)

Included in the following conference series:

International Workshop on Applied Parallel Computing

Abstract

Knowledge discovery in databases, or data mining, is the semi-automated analysis of large volumes of data in search of implicit relationships and knowledge that are ‘interesting’ in the sense of impacting an organization’s practice. Data mining and knowledge discovery on large amounts of data can benefit from the use of parallel computers, both to improve performance and to improve the quality of data selection. This paper presents and discusses the different forms of parallelism that can be exploited in data mining techniques and algorithms. For the main data mining techniques, such as rule induction, clustering algorithms, decision trees, genetic algorithms, and neural networks, the possible ways to exploit parallelism are presented and discussed in detail. Finally, some promising research directions in parallel data mining are outlined.


Author information

Authors and Affiliations

DEIS, Università della Calabria, Via P. Bucci, 41c, 87036, Rende, Italy
Domenico Talia

Editor information

Editors and Affiliations

CSC, P.O. Box 405, 02101, Espoo, Finland
Juha Fagerholm, Juha Haataja, Jari Järvinen, Mikko Lyly, Peter Råback & Ville Savolainen

About this paper

Cite this paper

Talia, D. (2002). Parallelism in Knowledge Discovery Techniques. In: Fagerholm, J., Haataja, J., Järvinen, J., Lyly, M., Råback, P., Savolainen, V. (eds) Applied Parallel Computing. PARA 2002. Lecture Notes in Computer Science, vol 2367. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48051-X_14

DOI: https://doi.org/10.1007/3-540-48051-X_14
Published: 04 July 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43786-4
Online ISBN: 978-3-540-48051-8
eBook Packages: Springer Book Archive
