
KDD Pipeline

Synonyms

Data mining pipeline; Data mining process; KDD process

Definition

The KDD pipeline describes the complete process of knowledge discovery in databases (KDD), that is, the process of deriving useful, valid, and non-trivial patterns from large amounts of data. The pipeline consists of five consecutive steps:

Selection

The selection step identifies the goal of the current application and selects a data set that is likely to contain relevant patterns.
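
As a minimal sketch of selection, assume the raw data are held as a list of Python dictionaries. The goal (predicting churn for adult customers), the attribute names, and the `select` helper are hypothetical and serve only as illustration.

```python
# Hypothetical raw records; every field name here is an illustrative assumption.
raw = [
    {"id": 1, "age": 34, "income": 52000, "favorite_color": "blue", "churned": False},
    {"id": 2, "age": 51, "income": 61000, "favorite_color": "red", "churned": True},
    {"id": 3, "age": 17, "income": 0, "favorite_color": "green", "churned": False},
]


def select(records, attributes, keep):
    """Return the instances that pass `keep`, projected onto `attributes`."""
    return [{a: r[a] for a in attributes} for r in records if keep(r)]


# Assumed goal: predict churn for adult customers, so only adults and the
# attributes thought to be related to churn are carried forward.
target_data = select(raw, ["age", "income", "churned"], lambda r: r["age"] >= 18)
```

The goal is encoded twice here: in the predicate that restricts the instances and in the list of attributes that are retained.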

Preprocessing

The preprocessing step increases the quality of the data set by supplementing missing attributes, removing duplicate instances and resolving data inconsistencies.
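
The three preprocessing tasks named above can be sketched as follows. This is one possible arrangement under simple assumptions: records are dictionaries, missing values are marked with `None`, at least one value of the numeric attribute is present, and inconsistencies are limited to variant spellings listed in a hypothetical `aliases` table. Mean imputation is an assumed strategy chosen for brevity.

```python
def preprocess(records, numeric_attr, aliases):
    """Resolve inconsistent spellings, fill missing values, drop duplicates."""
    # 1. Resolve inconsistencies: map variant spellings to one canonical value.
    cleaned = [
        {k: aliases.get(v, v) if isinstance(v, str) else v for k, v in r.items()}
        for r in records
    ]
    # 2. Supplement missing values of one numeric attribute with the observed mean.
    present = [r[numeric_attr] for r in cleaned if r[numeric_attr] is not None]
    mean = sum(present) / len(present)
    for r in cleaned:
        if r[numeric_attr] is None:
            r[numeric_attr] = mean
    # 3. Remove duplicate instances, keeping the first occurrence.
    seen, unique = set(), []
    for r in cleaned:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# Hypothetical input: two spellings of one city, one missing age.
records = [
    {"city": "NYC", "age": 30},
    {"city": "New York", "age": 30},
    {"city": "Boston", "age": None},
    {"city": "Boston", "age": 60},
]
result = preprocess(records, "age", {"NYC": "New York"})
```

Spellings are normalized before duplicates are removed because the first two records only become identical once "NYC" has been mapped to "New York".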

Transformation

The transformation step deletes correlated and irrelevant attributes and derives new, more meaningful attributes from the current data description.
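
A minimal sketch of transformation, assuming column-oriented numeric data with non-constant columns: strongly correlated attributes are detected with the Pearson coefficient and a new attribute is derived from the remaining ones. The 0.95 threshold, the column names, and the derived body-mass index are illustrative assumptions. Removal of irrelevant attributes is not shown.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.95):
    """Keep an attribute only if it is not strongly correlated with one already kept."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


# Hypothetical attributes; height_in is a rescaled copy of height_cm.
columns = {
    "height_cm": [160.0, 170.0, 180.0, 190.0],
    "height_in": [63.0, 66.9, 70.9, 74.8],
    "weight_kg": [70.0, 60.0, 90.0, 80.0],
}
reduced = drop_correlated(columns)

# Derive a new attribute (body-mass index) from two of the retained ones.
reduced["bmi"] = [
    w / (h / 100) ** 2 for w, h in zip(reduced["weight_kg"], reduced["height_cm"])
]
```

Which of two correlated attributes survives depends on iteration order in this sketch; a real application would normally decide that based on domain knowledge.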

Data Mining

The data mining step selects an algorithm that fits the goal identified in the selection step, then derives patterns or learns functions that are valid for the current data set.
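
To illustrate how the goal drives the choice of algorithm, the sketch below registers one pattern-deriving method (frequent item pairs) and one function-learning method (a nearest-centroid classifier) under hypothetical goal names. Both algorithms are deliberately simple stand-ins chosen for brevity, and the data are invented for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Derive patterns: item pairs found in at least `min_support` transactions."""
    counts = Counter(
        pair for t in transactions for pair in combinations(sorted(set(t)), 2)
    )
    return {pair: c for pair, c in counts.items() if c >= min_support}


def nearest_centroid(samples, labels):
    """Learn a function: a classifier that predicts the label of the closest class mean."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    centroids = {
        y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in groups.items()
    }

    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


# The goal fixed during selection decides which algorithm is applied.
ALGORITHMS = {"association": frequent_pairs, "classification": nearest_centroid}

goal = "classification"  # assumed goal for this example
samples = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [4.8, 5.2]]
labels = ["low", "low", "high", "high"]
model = ALGORITHMS[goal](samples, labels)
```

With the goal set to "classification", `model` is a callable that maps a new sample to a label; with "association", the same lookup would instead return a dictionary of frequent pairs and their support counts.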

Ev...

Author information

Corresponding author

Correspondence to Hans-Peter Kriegel.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this entry

Cite this entry

Kriegel, HP., Schubert, M. (2018). KDD Pipeline. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1134
