Abstract
This paper addresses classification over imperfect data streams. More precisely, it extends the standard CVFDT algorithm to handle uncertainty in both the tree-building and the classification procedures. Uncertainty is represented here by possibility distributions. The first part investigates how to build decision trees from data streams with uncertain attribute values. It develops a non-specificity-based information gain as the attribute selection measure, which in this setting is more appropriate than the standard measure based on Shannon entropy. The extended approach, called Possibilistic Very Fast Decision Tree for Uncertain Data Streams (Poss-CVFDT), offers a more flexible building procedure. The second part addresses the classification phase: it investigates how to predict the class value of new instances presented with certain and/or uncertain attribute values.
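To make the notion of non-specificity concrete, the sketch below computes the standard U-uncertainty measure of a possibility distribution, U(π) = Σ (π_(i) − π_(i+1)) · log2(i), with the values sorted in decreasing order and π_(n+1) = 0. This is only an illustration of the underlying measure. The function name `nonspecificity` is hypothetical, the input is assumed to be a normalized distribution (largest value equal to 1), and the exact information gain used by Poss-CVFDT is not given in the abstract and is not reproduced here.

```python
import math


def nonspecificity(pi):
    """U-uncertainty of a possibility distribution.

    Assumes `pi` is a non-empty sequence of possibility degrees in [0, 1]
    and is normalized (its largest value is 1). Illustrative helper only,
    not the paper's implementation.
    """
    # Sort degrees in decreasing order and append the conventional 0 sentinel.
    ordered = sorted(pi, reverse=True) + [0.0]
    # Rank i + 1 is 1-based, so the first term is weighted by log2(1) = 0.
    return sum(
        (ordered[i] - ordered[i + 1]) * math.log2(i + 1)
        for i in range(len(pi))
    )


# Complete knowledge: exactly one fully possible value, no non-specificity.
precise = nonspecificity([1.0, 0.0, 0.0])
# Total ignorance over four values: non-specificity equals log2(4).
ignorant = nonspecificity([1.0, 1.0, 1.0, 1.0])
```

A measure of this kind is 0 when a single value is fully possible and grows as more values become equally possible, which is the behaviour an attribute selection measure for possibilistic data can exploit.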
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hamroun, M., Gouider, M.S. (2015). Possibilistic Very Fast Decision Tree for Uncertain Data Streams. In: Neves-Silva, R., Jain, L., Howlett, R. (eds) Intelligent Decision Technologies. IDT 2015. Smart Innovation, Systems and Technologies, vol 39. Springer, Cham. https://doi.org/10.1007/978-3-319-19857-6_18
DOI: https://doi.org/10.1007/978-3-319-19857-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19856-9
Online ISBN: 978-3-319-19857-6
eBook Packages: Engineering, Engineering (R0)