Study on the Impact of Affinity on the Results of Data Mining in Biological Populations
In biological populations, genetic correlations between individuals result from genetic relatedness. In its standard form, breeding data is not stored in a way that lets users easily take this relatedness into account during data mining. The aim of this study was to verify whether, and to what extent, including this additional information (in the form of data on grandparents and great-grandparents) affects the results of data mining. This paper is one stage of an interdisciplinary research project investigating the population of Silesian horses. The database contains the breeding history of nearly the complete population of Silesian horses bred in Poland over the last 50 years. Because the goal is to predict the characteristics of offspring from the characteristics of their ancestors (parents, grandparents, great-grandparents), tests were conducted on the subset of individuals whose parents are known.
Keywords: genetic dependences in data mining; biological population; data base; prediction
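The idea of adding ancestor information to each individual's record can be illustrated with a minimal sketch. It is not the procedure used in the study. The pedigree table, the `height` trait, the animal identifiers, and the function name `ancestor_features` are all hypothetical, chosen only to show how the traits of parents and grandparents might be flattened into extra attributes of one row before data mining.

```python
# Hypothetical pedigree records: id -> sire id, dam id, and one measured trait.
# None marks an unknown parent. These values are invented for illustration.
PEDIGREE = {
    "A": {"sire": None, "dam": None, "height": 160},
    "B": {"sire": None, "dam": None, "height": 158},
    "C": {"sire": "A", "dam": "B", "height": 162},
    "D": {"sire": None, "dam": None, "height": 155},
    "E": {"sire": "C", "dam": "D", "height": 159},
}


def ancestor_features(pedigree, animal_id, trait, generations):
    """Flatten one trait of all ancestors up to `generations` back into a row.

    Keys encode the path to the ancestor, e.g. "sire_dam_height" is the
    trait of the sire's dam. Unknown ancestors yield None.
    """
    row = {}
    # Each frontier entry is (path label so far, animal id or None).
    frontier = [("", animal_id)]
    for _ in range(generations):
        next_frontier = []
        for label, current in frontier:
            record = pedigree.get(current) if current is not None else None
            for role in ("sire", "dam"):
                parent = record[role] if record else None
                path = f"{label}{role}_"
                parent_record = pedigree.get(parent) if parent is not None else None
                row[f"{path}{trait}"] = parent_record[trait] if parent_record else None
                next_frontier.append((path, parent))
        frontier = next_frontier
    return row


if __name__ == "__main__":
    # Parents and grandparents of animal "E" as additional attributes.
    for key, value in ancestor_features(PEDIGREE, "E", "height", 2).items():
        print(key, value)
```

With `generations=3` the same function would also add great-grandparent attributes, at the cost of many missing values for animals with shallow pedigrees.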