Comparing assignment-based approaches to breed identification within a large set of horses

Putnová, Lenka; Štohl, Radek

doi:10.1007/s13353-019-00495-x

Animal Genetics • Original Paper
Published: 08 April 2019

Volume 60, pages 187–198, (2019)
Journal of Applied Genetics

Abstract

Given its extensive data sets and statistical techniques, animal breeding can be regarded as an application area of machine learning, one with a steadily growing impact on breeding practice. This study examines the potential of machine learning and data mining for breed identification within a large set of horses and breeds. Individual assignment methods, and the factors influencing their success rate, are compared at the scale of the Czech horse population. Per-locus fixation index values ranged from 0.057 (HMS1) to 0.144 (HTG6), and the overall genetic differentiation among the breeds amounted to 8.9%. The highest genetic divergence (F_ST = 0.378) was found between the Friesian and Equus przewalskii; the highest degree of gene migration was found between the Czech and Bavarian Warmblood (N_m = 14,302); and the overall heterozygote deficit across the populations was 10.4%. Eight standard methods (Bayesian, frequency-based, and distance-based) implemented in the GeneClass software and almost all mainstream classification algorithms (Bayes Net, Naive Bayes, IB1, IB5, KStar, JRip, J48, Random Forest, Random Tree, PART, MLP, and SVM) from the WEKA machine learning workbench were compared using 314,874 real allelic data sets. The Bayesian method (GeneClass, 89.9%) and the Bayesian network algorithm (WEKA, 84.8%) outperformed the other techniques. The breed genomic prediction accuracy was highest in the cold-blooded horses. The overall proportion of individuals correctly assigned to a population depended mainly on the number of breeds and their genetic divergence. These statistical tools could be used to assess breed traceability systems, and they have the potential to assist managers in decisions on breeding and registration.
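To make the idea of a frequency-based assignment concrete, the following is a minimal Python sketch, not the GeneClass implementation and not the authors' pipeline. It estimates allele frequencies per breed from reference genotypes, scores a query genotype by its log-likelihood under Hardy-Weinberg proportions, and assigns the horse to the highest-scoring breed. The breed names, allele sizes, function names, and the fixed frequency `floor` of 0.01 for alleles absent from a breed are all illustrative assumptions.

```python
from collections import Counter
from math import log


def allele_frequencies(genotypes):
    """Estimate per-locus allele frequencies for one breed.

    Each genotype is a list of (allele_a, allele_b) tuples, one per locus.
    """
    n_loci = len(genotypes[0])
    freqs = []
    for locus in range(n_loci):
        counts = Counter()
        for genotype in genotypes:
            counts.update(genotype[locus])  # count both alleles of the pair
        total = sum(counts.values())
        freqs.append({allele: n / total for allele, n in counts.items()})
    return freqs


def log_likelihood(genotype, freqs, floor=0.01):
    """Log-likelihood of a genotype under Hardy-Weinberg proportions.

    `floor` is an assumed stand-in frequency for alleles never observed in
    the breed, so that a single unseen allele does not give log(0).
    """
    total = 0.0
    for (a, b), locus_freqs in zip(genotype, freqs):
        pa = locus_freqs.get(a, floor)
        pb = locus_freqs.get(b, floor)
        # Homozygote: p^2; heterozygote: 2pq
        total += log(pa * pa if a == b else 2 * pa * pb)
    return total


def assign(genotype, reference):
    """Return the best-scoring breed and the per-breed log-likelihoods."""
    scores = {
        breed: log_likelihood(genotype, allele_frequencies(genotypes))
        for breed, genotypes in reference.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two breeds, two microsatellite loci, made-up alleles.
reference = {
    "breed_A": [[(100, 100), (200, 202)],
                [(100, 102), (200, 200)],
                [(100, 100), (200, 200)]],
    "breed_B": [[(104, 104), (206, 206)],
                [(102, 104), (204, 206)],
                [(104, 104), (206, 204)]],
}
query = [(100, 100), (200, 200)]

best, scores = assign(query, reference)
print(best, scores)
```

In a real evaluation the query individual would be left out of its own breed's reference sample before the frequencies are estimated; otherwise the assignment rate is biased upward. The sketch omits that step for brevity.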

Acknowledgments

The authors would like to thank Professor Petr Hořín (Department of Animal Genetics, VFU Brno) for providing samples of the Camargue, Murgese, and Icelandic horses. This section would be incomplete without acknowledging Irena Vrtková, PhD (Laboratory of Agrogenomics) and her unwavering support over the years.

Funding

The research was funded by project NAZV QH92277 of the National Agency for Agricultural Research of the Ministry of Agriculture of the Czech Republic, with institutional support for the development of Mendel University in Brno. The research was further supported by the Ministry of Education, Youth and Sports under project No. LO1210, carried out at the Centre for Research and Utilization of Renewable Energy.

Author information

Authors and Affiliations

Laboratory of Agrogenomics, Department of Morphology, Physiology and Animal Genetics, Faculty of Agronomy, Mendel University in Brno, Zemědělská 1665/1, 613 00, Brno, Czech Republic
Lenka Putnová
Department of Control and Instrumentation, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technická 3082/12, 616 00, Brno, Czech Republic
Radek Štohl

Corresponding author

Correspondence to Lenka Putnová.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical statement

All procedures performed in studies involving animals were in accordance with the ethical standards of the institution or practice at which the studies were conducted.

Additional information

Communicated by: Maciej Szydlowski

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Putnová, L., Štohl, R. Comparing assignment-based approaches to breed identification within a large set of horses. J Appl Genetics 60, 187–198 (2019). https://doi.org/10.1007/s13353-019-00495-x

Received: 20 November 2018
Accepted: 25 March 2019
Published: 08 April 2019
Issue Date: 01 May 2019
DOI: https://doi.org/10.1007/s13353-019-00495-x
