Abstract
This chapter presents an introduction to data mining with machine learning. It gives an overview of various types of machine learning, along with some examples. It explains how to download, install, and run the WEKA data mining toolkit on a simple data set, then proceeds to explain how one might approach a bioinformatics problem. Finally, it includes a brief summary of machine learning algorithms for other types of data mining problems, and provides suggestions about where to find additional information.
Copyright information
© 2016 Springer Science+Business Media New York
About this protocol
Cite this protocol
Smith, T.C., Frank, E. (2016). Introducing Machine Learning Concepts with WEKA. In: Mathé, E., Davis, S. (eds) Statistical Genomics. Methods in Molecular Biology, vol 1418. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3578-9_17
DOI: https://doi.org/10.1007/978-1-4939-3578-9_17
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-3576-5
Online ISBN: 978-1-4939-3578-9
eBook Packages: Springer Protocols