Parallelism in Knowledge Discovery Techniques

  • Conference paper
Applied Parallel Computing (PARA 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2367)

Abstract

Knowledge discovery in databases, or data mining, is the semi-automated analysis of large volumes of data, looking for relationships and knowledge that are implicit in the data and are ‘interesting’ in the sense of impacting an organization’s practice. Data mining and knowledge discovery on large amounts of data can benefit from the use of parallel computers, both to improve performance and to improve the quality of data selection. This paper presents and discusses the different forms of parallelism that can be exploited in data mining techniques and algorithms. For the main data mining techniques, such as rule induction, clustering algorithms, decision trees, genetic algorithms, and neural networks, the possible ways to exploit parallelism are presented and discussed in detail. Finally, some promising directions for research in parallel data mining are outlined.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Talia, D. (2002). Parallelism in Knowledge Discovery Techniques. In: Fagerholm, J., Haataja, J., Järvinen, J., Lyly, M., Råback, P., Savolainen, V. (eds) Applied Parallel Computing. PARA 2002. Lecture Notes in Computer Science, vol 2367. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48051-X_14

  • DOI: https://doi.org/10.1007/3-540-48051-X_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43786-4

  • Online ISBN: 978-3-540-48051-8

  • eBook Packages: Springer Book Archive
