Statistical Methods for Data Mining and Knowledge Discovery

Vaillancourt, Jean

doi:10.1007/978-3-642-11928-6_4

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5986)

Abstract

This survey paper aims mainly to give computer scientists a rapid bird's-eye view, from a mathematician's perspective, of the main statistical methods used to extract knowledge from databases comprising various types of observations. After briefly touching on supervision and data regularization and reviewing the main models, it examines the key issues of model assessment, selection and inference. Finally, it explores specific statistical problems arising from applications in data mining and warehousing. Examples and applications are drawn mainly from the vast collection of image and video retrieval, indexing and classification challenges facing us today.

Author information

Authors and Affiliations

UQO, Gatineau, QC, J8X 3X7, Canada
Jean Vaillancourt

Editor information

Editors and Affiliations

School of Engineering, Zurich University of Applied Sciences, Technikumstraße 9, 8401, Winterthur, Switzerland
Léonard Kwuida
SAP Research Center, Chemnitzer Straße 48, 01187, Dresden, Germany
Barış Sertkaya

Cite this paper

Vaillancourt, J. (2010). Statistical Methods for Data Mining and Knowledge Discovery. In: Kwuida, L., Sertkaya, B. (eds) Formal Concept Analysis. ICFCA 2010. Lecture Notes in Computer Science, vol 5986. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11928-6_4

DOI: https://doi.org/10.1007/978-3-642-11928-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11927-9
Online ISBN: 978-3-642-11928-6
eBook Packages: Computer Science, Computer Science (R0)
