Abstract
Rapid developments in telecommunication, sensor data, financial applications, data stream analysis, and related fields have increased the rate of data arrival, making data mining a vital process. Among the tasks involved in data analysis, data stream classification faces more challenges than other commonly used techniques. Although classification is a continuous process, it requires a design that can adapt the classification model to concept change or to changes in the boundaries between classes. Hence, we design a novel fuzzy classifier, THRFuzzy, to classify newly arriving data streams. Rough set theory, together with a tangential holoentropy function, is used to design the dynamic classification model. The approach uses kernel fuzzy c-means (FCM) clustering to generate the rules and the tangential holoentropy function to update the membership function. The performance of THRFuzzy is evaluated on three datasets (skin segmentation, localization, and breast cancer) using accuracy and time as metrics, and is compared with that of the HRFuzzy and adaptive k-NN classifiers. The experimental results show that the THRFuzzy classifier gives better classification results, achieving the highest accuracy in the least time among the compared classifiers.
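As a rough illustration of the clustering step named in the abstract, the following is a minimal sketch of a standard Gaussian-kernel fuzzy c-means iteration. It is not the THRFuzzy implementation: the Gaussian kernel, the fuzzifier `m`, the kernel width `sigma`, and the names `kernel_fcm` and `memberships` are assumptions made for this example, and the tangential holoentropy membership update is not reproduced because its formula is not given here.

```python
import numpy as np

def memberships(X, centers, m, sigma):
    """Fuzzy memberships under a Gaussian kernel (assumed kernel choice)."""
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / sigma**2)
    # Kernel-induced distance 1 - K, clipped to avoid division by zero.
    dist = np.clip(1.0 - K, 1e-12, None)
    U = dist ** (-1.0 / (m - 1.0))
    U /= U.sum(axis=1, keepdims=True)  # each row sums to 1
    return U, K

def kernel_fcm(X, n_clusters, m=2.0, sigma=1.0, n_iter=100, tol=1e-6,
               init=None, seed=0):
    """Generic kernel FCM sketch; returns (centers, membership matrix)."""
    X = np.asarray(X, dtype=float)
    if init is None:
        # Start from distinct randomly chosen samples.
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    else:
        centers = np.array(init, dtype=float)
    for _ in range(n_iter):
        U, K = memberships(X, centers, m, sigma)
        # Centre update: average of samples weighted by u^m * K.
        W = (U ** m) * K
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    U, _ = memberships(X, centers, m, sigma)
    return centers, U

# Toy data: two small, well-separated groups (illustrative only).
offsets = np.array([[0, 0], [0.1, 0], [0, 0.1], [-0.1, 0], [0, -0.1]])
X = np.vstack([offsets, offsets + 3.0])
centers, U = kernel_fcm(X, n_clusters=2, init=X[[0, 5]])
print(np.round(centers, 3))
print(U.argmax(axis=1))
```

In a rule-based fuzzy classifier, cluster centres and memberships of this kind could serve as the starting point for rule generation. With random initialisation, the sketch may settle in a poor local optimum when `sigma` is small relative to the spacing between groups, which is why the toy example passes explicit initial centres.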
Acknowledgment
This work is supported by proposal No. OSD/BCUD/392/197, Board of Colleges and University Development, Savitribai Phule Pune University, Pune.
About this article
Cite this article
Nalavade, J.E., Murugan, T.S. THRFuzzy: Tangential holoentropy-enabled rough fuzzy classifier to classification of evolving data streams. J. Cent. South Univ. 24, 1789–1800 (2017). https://doi.org/10.1007/s11771-017-3587-5
DOI: https://doi.org/10.1007/s11771-017-3587-5