What Is Data Mining and How Does It Work?

Calders, Toon; Custers, Bart

doi:10.1007/978-3-642-30487-3_2

Part of the book series: Studies in Applied Philosophy, Epistemology and Rational Ethics (SAPERE, volume 3)

Abstract

Recent technological developments have made it possible to generate and store ever larger datasets. The value of data, however, is determined not by its volume but by the ability to interpret and analyze it and to base future policies and decisions on the outcome of that analysis. The amounts of data collected today offer companies and governments unprecedented opportunities to improve decision procedures, but they also pose great challenges: many pre-existing data analysis tools did not scale to current data sizes. From this need, the research field of data mining emerged. In this chapter we position data mining with respect to other data analysis techniques and introduce the most important classes of techniques developed in the area: pattern mining, classification, and clustering and outlier detection. Related supporting techniques, such as pre-processing and database coupling, are also discussed.

Author information

Authors and Affiliations

Eindhoven University of Technology, Eindhoven, The Netherlands
Toon Calders
eLaw, Institute for Law in the Information Society, Leiden University, Leiden, The Netherlands
Bart Custers

Corresponding author

Correspondence to Toon Calders.

Editor information

Editors and Affiliations

Faculty of Law, Leiden University, Leiden, 2300 RA, Netherlands
Bart Custers
Faculty of Math and Computer Science, Eindhoven University of Technology, Eindhoven, 5600, Netherlands
Toon Calders
Faculty of Law, Leiden University, Leiden, 2300 RA, Netherlands
Bart Schermer
Faculty of Law, Haifa University, Mount Carmel, Haifa, 31905, Israel
Tal Zarsky

About this chapter

Cite this chapter

Calders, T., Custers, B. (2013). What Is Data Mining and How Does It Work? In: Custers, B., Calders, T., Schermer, B., Zarsky, T. (eds) Discrimination and Privacy in the Information Society. Studies in Applied Philosophy, Epistemology and Rational Ethics, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30487-3_2

DOI: https://doi.org/10.1007/978-3-642-30487-3_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30486-6
Online ISBN: 978-3-642-30487-3
eBook Packages: Engineering, Engineering (R0)