A feature selection strategy of E-nose data based on PCA coupled with Wilks Λ-statistic for discrimination of vinegar samples
To improve the correct discrimination rate of six kinds of vinegar samples measured with an electronic nose (E-nose), a feature selection strategy based on principal component analysis (PCA) coupled with the Wilks Λ-statistic is put forward. PCA is first used to generate principal component (PC) variables, which eliminates the correlation between the original feature variables; the PC variables that are beneficial for identifying the vinegar samples are then selected by the Wilks Λ-statistic. Because each PC variable is a linear combination of all original feature variables, the sum of the absolute values of the combination coefficients of each original feature variable over the selected PC variables can be calculated. Candidate sets of original feature variables are formed by ranking these sums from large to small, and the best set is determined by comparing their discrimination results. With this strategy, 51 original feature variables were selected as representative features of the E-nose data. To verify the effectiveness of the feature selection strategy, Fisher discriminant analysis (FDA) and a radial basis function neural network (RBFNN) were employed to discriminate the six kinds of vinegar samples. The correct discrimination rates of the training sets were 94% and 97%, respectively, and those of the corresponding test sets were at least 90% and 92%, respectively. Moreover, the Bhattacharyya distance (B-distance) was employed to explain the separability of these vinegar samples and the reliability of the FDA and RBFNN results.
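The selection procedure described in the abstract can be illustrated with a minimal NumPy sketch. It rests on several assumptions that the abstract does not confirm: features are only mean-centered before PCA, the Wilks Λ-statistic is computed separately for each PC as the ratio of within-class to total sum of squares (the paper may use a stepwise multivariate form), and the function names `pca_scores`, `wilks_lambda`, `rank_features`, and `top_k_sets` are hypothetical.

```python
import numpy as np

def pca_scores(X, tol=1e-10):
    """Return (loadings, scores); loadings has one column per retained PC."""
    Xc = X - X.mean(axis=0)                      # assumption: centering only
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > tol * s[0]                        # drop zero-variance directions
    loadings = Vt[keep].T                        # shape (n_features, n_pcs)
    return loadings, Xc @ loadings

def wilks_lambda(scores, labels):
    """Per-PC Wilks' lambda (assumed univariate form): within SS / total SS."""
    labels = np.asarray(labels)
    total = ((scores - scores.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(scores.shape[1])
    for c in np.unique(labels):
        g = scores[labels == c]
        within += ((g - g.mean(axis=0)) ** 2).sum(axis=0)
    return within / total                        # smaller = better separation

def rank_features(X, labels, n_pcs):
    """Rank original features by summed |coefficient| over the selected PCs."""
    loadings, scores = pca_scores(X)
    lam = wilks_lambda(scores, labels)
    chosen = np.argsort(lam)[:n_pcs]             # PCs with the smallest lambda
    weight = np.abs(loadings[:, chosen]).sum(axis=1)
    order = np.argsort(weight)[::-1]             # from large to small
    return order, weight

def top_k_sets(order, sizes):
    """Candidate feature sets: the top-k ranked features for each k in sizes."""
    return [order[:k] for k in sizes]
```

Each candidate set returned by `top_k_sets` would then be passed to a classifier such as FDA or an RBFNN, and the set with the best discrimination result retained. The number of PCs to select (`n_pcs`) and the candidate set sizes are free parameters in this sketch.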
Keywords: Feature selection; Vinegar; Electronic nose; Principal component analysis; Wilks Λ-statistic
This work is supported by the National Natural Science Foundation of China (NSFC) under Grant No. 31571923. The authors acknowledge this support.