What Is Data Mining and How Does It Work?
Due to recent technological developments, it has become possible to generate and store increasingly large datasets. The value of data, however, is determined not by its amount but by the ability to interpret and analyze it, and to base future policies and decisions on the outcome of that analysis. The amounts of data collected nowadays offer unprecedented opportunities to improve decision procedures for companies and governments, but they also pose great challenges. Many pre-existing data analysis tools did not scale up to current data sizes. From this need, the research field of data mining emerged. In this chapter we position data mining with respect to other data analysis techniques and introduce the most important classes of techniques developed in the area: pattern mining, classification, and clustering and outlier detection. Related supporting techniques, such as pre-processing and database coupling, are also discussed.
Keywords: Data Mining, Outlier Detection, Pattern Mining, Association Rule Mining, Class Boundary