Abstract
Associative classifiers are relatively easy for people to understand and often outperform decision tree learners on classification problems. Existing associative classifiers work only with certain data. However, data uncertainty is prevalent in many real-world applications such as sensor networks, market analysis and medical diagnosis, and it may render many conventional classifiers inapplicable. In this paper, based on the U-Apriori algorithm and the CBA algorithm, we propose an associative classifier for uncertain data, uCBA (uncertain Classification Based on Associative), which can classify both certain and uncertain data. The algorithm redefines the support, confidence, rule pruning and classification strategy of CBA. Experimental results on 21 datasets from the UCI Repository demonstrate that the proposed algorithm performs well and remains satisfactory even on highly uncertain data.
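As a rough illustration of what redefining support and confidence for uncertain data can look like, the sketch below computes the expected support used by U-Apriori (the sum, over transactions, of the product of the item existence probabilities) and a rule confidence derived from it. The data layout, the function names, the certain class labels, and the ratio used for confidence are assumptions made for this example; they are not taken from the uCBA definitions in the paper.

```python
from math import prod

# Hypothetical data layout: each record is (items, label), where `items`
# maps an item to the probability that it is present in the record.
# Class labels are assumed to be certain in this sketch.
data = [
    ({"a": 0.9, "b": 0.5}, "yes"),
    ({"a": 0.4}, "no"),
    ({"a": 1.0, "b": 1.0}, "yes"),
]

def expected_support(itemset, records, label=None):
    """Expected support count of `itemset`, optionally restricted to one class.

    Items are assumed independent, so the probability that a record
    contains the whole itemset is the product of the item probabilities.
    """
    total = 0.0
    for items, cls in records:
        if label is not None and cls != label:
            continue
        total += prod(items.get(item, 0.0) for item in itemset)
    return total

def expected_confidence(itemset, label, records):
    """Assumed confidence of the rule itemset -> label: a ratio of expected supports."""
    denom = expected_support(itemset, records)
    if denom == 0.0:
        return 0.0
    return expected_support(itemset, records, label) / denom

support_ab = expected_support({"a", "b"}, data)
confidence_a_yes = expected_confidence({"a"}, "yes", data)
```

With certain data (all probabilities equal to 0 or 1), both functions reduce to the ordinary support count and confidence used by CBA, which is consistent with a classifier that handles both certain and uncertain data.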
This work is supported by the National Natural Science Foundation of China (60873196) and Chinese Universities Scientific Fund (QN2009092).
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qin, X., Zhang, Y., Li, X., Wang, Y. (2010). Associative Classifier for Uncertain Data. In: Chen, L., Tang, C., Yang, J., Gao, Y. (eds) Web-Age Information Management. WAIM 2010. Lecture Notes in Computer Science, vol 6184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14246-8_66
DOI: https://doi.org/10.1007/978-3-642-14246-8_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14245-1
Online ISBN: 978-3-642-14246-8
eBook Packages: Computer Science, Computer Science (R0)