Abstract
Machine learning, which can be briefly defined as enabling computers to make successful predictions from past experience, has developed impressively in recent years, helped by the rapid increase in the storage capacity and processing power of computers. As in many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulty and cost of biological analyses have led to the development of sophisticated machine learning approaches for this application area. In this chapter, we first review the fundamental concepts of machine learning, such as feature assessment, unsupervised versus supervised learning, and types of classification. Then, we point out the main issues in designing machine learning experiments and evaluating their performance. Finally, we introduce some supervised learning methods.
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Baştanlar, Y., Özuysal, M. (2014). Introduction to Machine Learning. In: Yousef, M., Allmer, J. (eds) miRNomics: MicroRNA Biology and Computational Analysis. Methods in Molecular Biology, vol 1107. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-748-8_7
DOI: https://doi.org/10.1007/978-1-62703-748-8_7
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-747-1
Online ISBN: 978-1-62703-748-8
eBook Packages: Springer Protocols