Feature Evaluation Metrics for Population Genomic Data

Kavakiotis, Ioannis; Triantafyllidis, Alexandros; Tsoumakas, Grigorios; Vlahavas, Ioannis

doi:10.1007/978-3-319-07064-3_36

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8445)

Abstract

Single Nucleotide Polymorphisms (SNPs) are nowadays considered one of the most important classes of genetic markers, with a wide range of applications of both scientific and economic interest. Although advances in biotechnology have made the production of genome-wide SNP datasets feasible, the cost of production is still high. Transforming the initial dataset into a smaller one that carries the same genetic information is therefore a crucial task, and it is performed through feature selection. Biologists evaluate features using methods originating from the field of population genetics. Although several studies have compared the existing biological methods, comparisons between methods originating from biology and methods originating from machine learning are lacking. In this study we present some early results which support the claim that biological methods perform slightly better than machine learning methods.
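To make the idea of scoring SNPs as features concrete, the following is a minimal sketch and not the authors' method. The abstract does not name the metrics that were compared, so the two scores here are illustrative assumptions: the absolute allele-frequency difference between two populations (a population genetics style score) and information gain with respect to the population label (a machine learning style score). The toy genotypes, the SNP names, and all function names are made up for illustration.

```python
import math
from collections import Counter

def allele_freq(genotypes):
    """Alternate-allele frequency; each genotype is 0, 1 or 2 copies."""
    return sum(genotypes) / (2 * len(genotypes))

def delta(geno_a, geno_b):
    """Absolute allele-frequency difference between two populations."""
    return abs(allele_freq(geno_a) - allele_freq(geno_b))

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(genotypes, labels):
    """Information gain of a genotype feature w.r.t. population labels."""
    n = len(labels)
    groups = {}
    for g, y in zip(genotypes, labels):
        groups.setdefault(g, []).append(y)
    conditional = sum(len(ys) / n * entropy(ys) for ys in groups.values())
    return entropy(labels) - conditional

def rank(scores):
    """Feature names sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: 3 SNPs, 4 individuals in each of 2 populations.
pop_a = {"snp1": [0, 0, 0, 1], "snp2": [0, 1, 2, 1], "snp3": [2, 2, 2, 2]}
pop_b = {"snp1": [2, 2, 1, 2], "snp2": [1, 0, 1, 2], "snp3": [2, 2, 1, 2]}
labels = ["A"] * 4 + ["B"] * 4

delta_scores = {s: delta(pop_a[s], pop_b[s]) for s in pop_a}
ig_scores = {s: info_gain(pop_a[s] + pop_b[s], labels) for s in pop_a}

print("delta ranking:", rank(delta_scores))
print("info gain ranking:", rank(ig_scores))
```

Feature selection then amounts to keeping the top-ranked SNPs under the chosen score. On this toy data both scores produce the same ordering; on real data the rankings can differ, which is the kind of comparison the study addresses.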

Author information

Authors and Affiliations

Department of Computer Science, Aristotle University of Thessaloniki, 54124, Greece
Ioannis Kavakiotis, Grigorios Tsoumakas & Ioannis Vlahavas
Department of Genetics, Development and Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 54124, Greece
Alexandros Triantafyllidis

Editor information

Editors and Affiliations

Department of Computer Science, University of Ioannina, GR 45110, Ioannina, Greece
Aristidis Likas
Department of Computer Science, University of Ioannina, P.O. Box 1186, 45110, Ioannina, Greece
Konstantinos Blekas
Hellenic Open University, GR 26335, Peribola, Patras, Greece
Dimitris Kalles

About this paper

Cite this paper

Kavakiotis, I., Triantafyllidis, A., Tsoumakas, G., Vlahavas, I. (2014). Feature Evaluation Metrics for Population Genomic Data. In: Likas, A., Blekas, K., Kalles, D. (eds) Artificial Intelligence: Methods and Applications. SETN 2014. Lecture Notes in Computer Science, vol 8445. Springer, Cham. https://doi.org/10.1007/978-3-319-07064-3_36

DOI: https://doi.org/10.1007/978-3-319-07064-3_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07063-6
Online ISBN: 978-3-319-07064-3
eBook Packages: Computer Science; Computer Science (R0)
