Abstract
Data Mining, sometimes referred to as Knowledge Discovery in Databases (KDD), applies artificial intelligence, pattern recognition, and database techniques to the commercial analysis and exploitation of large amounts of data. No discussion of modern knowledge-based systems would be complete without mention of this important topic. Data Mining may be oriented towards (a) producing summary descriptions and visualizations of collections of data, (b) finding correlations among data attributes, (c) discriminating among classes using attribute values, (d) predicting values for output attributes, (e) identifying groups of similar data, or (f) using historical information to predict the future values of variables. The following are some examples of data mining applications:
- Fraud detection in the use of credit cards and accounts
- Determination of the most appropriate target markets for a product
- Association of market segments with specific marketing strategies
- Analysis of medical histories to evaluate the risk of inheriting a disease
- Determination of consumption and usage patterns of customers
- Projections of demand and supply of consumer products
- Creditworthiness evaluation of loan applicants
- Stock market predictions
"Dirt-covered, a diamond lay hidden in the marketplace; fools passed, but only the wise would know its face." (Kabirdas, c. 1450)
© 2000 Springer Science+Business Media New York
About this chapter
Cite this chapter
Mohan, C.K. (2000). Data Mining. In: Frontiers of Expert Systems. The Springer International Series in Engineering and Computer Science, vol 552. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-4509-5_9
DOI: https://doi.org/10.1007/978-1-4615-4509-5_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7033-8
Online ISBN: 978-1-4615-4509-5
eBook Packages: Springer Book Archive