Study of Dimensionality Reduction Techniques for Effective Investment Portfolio Data Management
The aim of dimensionality reduction is to obtain a meaningful low-dimensional representation of a high-dimensional data set. Several new nonlinear methods have been proposed over the years, but how to assess them remains an open question. Dimensionality reduction is a vital problem in both supervised and unsupervised learning. For high-dimensional data, computation becomes heavy if no pre-processing is done before the data is supplied to a classifier, and constraints such as memory and speed make this unsuitable for certain practical applications. In attribute selection, attribute sets are provided as input to the classifier. Attributes that lead to incorrect classification are considered irrelevant and are removed, leaving a subset of selected attributes. Thus, the accuracy of the classifier improves and the running time is reduced. Attribute evaluators such as the cfsSubset evaluator, information gain ranking filter, chi-squared ranking filter and gain ratio feature evaluator are used with the classifiers, viz. decision table, decision stump, J48 and random forest. Individual investors' investment portfolio data is used for the present study. Twenty-six attributes are obtained from the questionnaire. By applying dimensionality reduction techniques, five major attributes are obtained using each of the information gain ranking filter, chi-squared ranking filter and gain ratio feature evaluator, and seven attributes using the cfsSubset evaluator. An accuracy of 70.7692% is obtained using the three ranking-based attribute evaluators for all five classification algorithms, whereas the cfsSubset evaluator combined with the random forest classifier gives 81.5385% accuracy. It has been observed that the cfsSubset evaluator, with partition membership as the pre-processing technique and random forest as the classification algorithm, performs better in terms of accuracy.
Keywords: Dimensionality reduction, Classification, cfsSubset evaluator, Random forest, Chi-squared ranking filter, Information gain ranking filter
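The ranking filters named in the abstract (information gain, gain ratio and chi-squared) each score one nominal attribute against the class label and keep the top-scoring attributes. The following is a minimal sketch of that idea in plain Python, not the authors' workflow. The attribute names, class labels and the six toy records are hypothetical and stand in for the questionnaire data, which is not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of nominal values."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy after splitting on one nominal attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(values, labels):
    """Information gain normalised by the entropy of the attribute itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


def chi_squared(values, labels):
    """Pearson chi-squared statistic of the attribute/class contingency table."""
    n = len(labels)
    observed = Counter(zip(values, labels))
    value_totals = Counter(values)
    label_totals = Counter(labels)
    stat = 0.0
    for v, nv in value_totals.items():
        for y, ny in label_totals.items():
            expected = nv * ny / n
            stat += (observed[(v, y)] - expected) ** 2 / expected
    return stat


def rank_attributes(rows, labels, score, top_k):
    """Return the names of the top_k attributes under the given scoring function."""
    scores = {name: score([r[name] for r in rows], labels) for name in rows[0]}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy records; the real study starts from 26 questionnaire attributes.
rows = [
    {"risk_appetite": "high", "age_group": "young", "gender": "m"},
    {"risk_appetite": "high", "age_group": "old", "gender": "f"},
    {"risk_appetite": "low", "age_group": "young", "gender": "f"},
    {"risk_appetite": "low", "age_group": "old", "gender": "m"},
    {"risk_appetite": "high", "age_group": "young", "gender": "f"},
    {"risk_appetite": "low", "age_group": "old", "gender": "m"},
]
labels = ["equity", "equity", "deposit", "deposit", "equity", "deposit"]

for score in (information_gain, gain_ratio, chi_squared):
    print(score.__name__, rank_attributes(rows, labels, score, top_k=1))
```

In the study the same kind of ranking would be cut off at five attributes rather than one. The cfsSubset evaluator works differently: it scores whole subsets by their correlation with the class and their mutual redundancy, so it is not covered by this sketch.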
The authors have obtained permission from the investors to use the data. The authors take full responsibility for any consequences should any issues arise from its use. The publisher and editors are not responsible for this.