Abstract
We study several instance selection methods based on rough set theory and propose an approach that handles inconsistency caused by noise and imbalanced data. Fuzzy-rough sets have recently attracted attention for the significant results they achieve in selecting instances from noisy data. For imbalanced data, fuzzy-rough approaches have also been applied before and after balancing methods to improve classification performance. We propose an approach that uses different criteria for the minority and majority classes in fuzzy-rough instance selection, which eliminates the balancing step employed in the controversial approach. We also run experiments, measure classification performance, and compare the results with those of other methods.
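To make the idea of class-specific criteria concrete, the following is a minimal sketch and not the authors' algorithm, since the abstract does not state the actual criteria. It assumes a min-aggregated per-attribute similarity, the Łukasiewicz implicator for the fuzzy-rough lower approximation, and two hypothetical thresholds, `tau_minority` and `tau_majority`; all function and parameter names are illustrative.

```python
import numpy as np

def fuzzy_similarity(X):
    """Pairwise similarity in [0, 1] (assumed form): per-attribute
    1 - |difference| / range, combined across attributes with min."""
    X = np.asarray(X, dtype=float)
    ranges = X.max(axis=0) - X.min(axis=0)
    ranges[ranges == 0] = 1.0  # constant attributes: avoid division by zero
    diffs = np.abs(X[:, None, :] - X[None, :, :]) / ranges
    return 1.0 - diffs.max(axis=2)  # min of (1 - d) equals 1 - max of d

def lower_approx_membership(X, y):
    """Membership of each instance in the fuzzy-rough lower approximation
    of its own class, using the Lukasiewicz implicator (an assumption)."""
    y = np.asarray(y)
    R = fuzzy_similarity(X)
    other_class = y[:, None] != y[None, :]
    # I(R, 1) = 1 for same-class pairs, I(R, 0) = 1 - R for other-class pairs
    values = np.where(other_class, 1.0 - R, 1.0)
    return values.min(axis=1)

def select_instances(X, y, minority_label, tau_minority=0.0, tau_majority=0.5):
    """Return indices of instances to keep. Hypothetical rule: an instance
    is kept if its membership reaches the threshold for its class."""
    y = np.asarray(y)
    membership = lower_approx_membership(X, y)
    thresholds = np.where(y == minority_label, tau_minority, tau_majority)
    return np.flatnonzero(membership >= thresholds)
```

With a lenient minority threshold and a stricter majority threshold, majority instances that lie close to minority instances are removed while the minority instances themselves are retained, so no separate balancing step is applied in this sketch.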
Copyright information
© 2015 IFIP International Federation for Information Processing
About this paper
Cite this paper
Van Nguyen, D., Ogawa, K., Matsumoto, K., Hashimoto, M. (2015). Editing Training Sets from Imbalanced Data Using Fuzzy-Rough Sets. In: Chbeir, R., Manolopoulos, Y., Maglogiannis, I., Alhajj, R. (eds) Artificial Intelligence Applications and Innovations. AIAI 2015. IFIP Advances in Information and Communication Technology, vol 458. Springer, Cham. https://doi.org/10.1007/978-3-319-23868-5_9
DOI: https://doi.org/10.1007/978-3-319-23868-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23867-8
Online ISBN: 978-3-319-23868-5
eBook Packages: Computer Science, Computer Science (R0)