Abstract
In machine learning, there is often a large range of candidate features available for classifying cases into groups. This chapter concentrates on feature selection methods that narrow these candidates down to the characteristics of interest, yielding more parsimonious and cost-effective models. It describes the main aspects of feature selection: the choice of method (wrapper, embedded, or filter), the evaluation functions used to identify an optimal subset of features, and the validation of model fit. Worked examples using a random forest algorithm in R for classification are presented, together with diagnostics that show how the most important classification features are selected. Feature selection is then considered for a specific set of models whose algorithms treat the data as genetic information based upon pairs of chromosomes. These models incorporate concepts from genetics such as parents, children, reproduction, and mutation. The genetic approach to feature selection is illustrated in R using two 10-item subscales from a questionnaire measuring sexual pain.
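The genetic approach summarised above can be sketched in a few lines. The chapter's own examples are in R; the Python below is only an illustrative sketch, not the authors' code. Everything in it is an assumption made for illustration: the simulated 10-item questionnaire, the helper names (`simulate_items`, `make_fitness`, `select_features`), and the evaluation function, which here rewards a subset of items whose sum correlates with the full-scale total and charges a small cost per retained item. Each candidate subset is a "chromosome" of 0/1 genes; parents are chosen by tournament, children are produced by one-point crossover, and genes are occasionally flipped by mutation.

```python
import random


def simulate_items(n_people=100, n_items=10, seed=1):
    """Simulate 1-5 item responses driven by one latent trait (toy data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_people):
        trait = rng.gauss(0, 1)
        row = [min(5, max(1, round(3 + trait + rng.gauss(0, 1))))
               for _ in range(n_items)]
        data.append(row)
    return data


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def make_fitness(data, item_cost=0.02):
    """Assumed evaluation function: short-form/full-scale correlation
    minus a penalty for every item kept."""
    totals = [sum(row) for row in data]

    def fitness(mask):
        if not any(mask):
            return -1.0  # an empty subset is never acceptable
        short = [sum(v for v, keep in zip(row, mask) if keep) for row in data]
        return pearson(short, totals) - item_cost * sum(mask)

    return fitness


def select_features(fitness, n_features, pop_size=20, generations=25,
                    mutation_rate=0.05, seed=0):
    """Genetic search over 0/1 masks; returns (best_mask, best-score history)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        children = [pop[0][:]]  # elitism: the best parent survives unchanged
        while len(children) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Reproduction by one-point crossover.
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    items = simulate_items()
    score = make_fitness(items)
    best_mask, scores = select_features(score, n_features=10)
    print("selected items:", [i + 1 for i, g in enumerate(best_mask) if g])
    print("best fitness:", round(scores[-1], 3))
```

Because the best chromosome is copied unchanged into each new generation, the best score can never decrease from one generation to the next. In a real application the evaluation function would be replaced by a validated measure of model fit, and the per-item cost would be chosen to reflect how strongly a shorter scale is preferred.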
Notes
1. Adenine
2. Cytosine
3. Guanine
4. Thymine
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Farahani, H., Blagojević, M., Azadfallah, P., Watson, P., Esrafilian, F., Saljoughi, S. (2023). Feature Selection in AP. In: An Introduction to Artificial Psychology. Springer, Cham. https://doi.org/10.1007/978-3-031-31172-7_7
DOI: https://doi.org/10.1007/978-3-031-31172-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31171-0
Online ISBN: 978-3-031-31172-7
eBook Packages: Behavioral Science and Psychology (R0)