Data Mining: Algorithms and Problems

Wu, Xindong

doi:10.1007/978-3-540-87656-4_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5271)

Included in the following conference series:

International Workshop on Hybrid Artificial Intelligence Systems

Abstract

Data mining, or knowledge discovery in databases (KDD), is an interdisciplinary field that integrates techniques from several research areas, including machine learning, statistics, database systems, and pattern recognition, for the analysis of large volumes of possibly complex, highly distributed, and poorly organized data. The prosperity of the data mining field can be attributed to two essential reasons. First, a huge amount of data is collected and stored every day. On the one hand, with the continuing development of advanced technologies in many domains, data is generated at enormous speeds. Examples include purchase data from department and grocery stores, bank and credit card transaction data, e-commerce data, Internet traffic data that describes the browsing history of Web users, remote sensor data from agricultural satellites, and gene expression data from microarray technology. On the other hand, progress in hardware technology allows today’s computer systems to store very large amounts of data. Second, with these large volumes of data at hand, data owners have a pressing need to turn them into useful knowledge. From a commercial viewpoint, the ultimate goal of data owners is to gain more and pay less in their business activities. Under competitive pressure, they want to enhance their services, develop cost-effective strategies, and target the right group of potential customers. From a scientific viewpoint, when traditional techniques are infeasible for dealing with the raw data, data mining may help scientists in many ways, such as classifying and segmenting data.
By applying the knowledge extracted through data mining, a business analyst may rate customers by their propensity to respond to an offer, a doctor may estimate the probability of an illness recurring, a website publisher may display customized Web pages to individual users according to their browsing habits, and a geneticist may discover novel gene-gene interaction patterns. In this talk, we aim to provide a general picture of the important data mining steps, topics, algorithms, and challenges.

Author information

Authors and Affiliations

Department of Computer Science, University of Vermont, 33 Colchester Avenue, Burlington, Vermont, 05405, USA
Xindong Wu

Editor information

Editors and Affiliations

Escuela Politécnica Superior, GICAP Research Group, Universidad de Burgos, Calle Francisco de Vitoria S/N, Edificio C, Campus Vena, 09006, Burgos, Spain
Emilio Corchado
Center of Excellence for Quantifiable Quality of Service, Norwegian University of Science and Technology, 7491, Trondheim, Norway
Ajith Abraham
Department of Electrical and Computer Engineering, University of Alberta, Edmonton, T6G 2V4, Alberta, Canada
Witold Pedrycz

About this paper

Cite this paper

Wu, X. (2008). Data Mining: Algorithms and Problems. In: Corchado, E., Abraham, A., Pedrycz, W. (eds) Hybrid Artificial Intelligence Systems. HAIS 2008. Lecture Notes in Computer Science, vol 5271. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87656-4_1

DOI: https://doi.org/10.1007/978-3-540-87656-4_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87655-7
Online ISBN: 978-3-540-87656-4
eBook Packages: Computer Science, Computer Science (R0)