Improving Classification Accuracy on Uncertain Data by Considering Multiple Subclasses

  • Conference paper
AI 2012: Advances in Artificial Intelligence (AI 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7691)

Abstract

We study the problem of classifying uncertain objects, whose locations are not known exactly and are instead described by probability density functions (pdfs). Although several classification algorithms have been proposed for uncertain objects, the existing algorithms are complex and time consuming. We therefore propose a novel supervised UK-means algorithm that classifies uncertain objects more efficiently. Supervised UK-means assumes that the classes are well separated. In real data, however, subsets of objects of the same class are often interspersed among (disconnected by) other classes. To handle this case, we propose a new algorithm, Supervised UK-means with Multiple Subclasses (SUMS), which assumes that the objects of a class can be further divided into several groups (subclasses) and learns a representative for each subclass in order to classify objects more accurately.
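The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the SUMS procedure itself. It assumes that each uncertain object is given as a list of points sampled from its pdf, that the number of subclasses per class is fixed in advance, and that subclass representatives are found by plain k-means on the objects' expected locations. All function names (`mean_point`, `expected_sq_dist`, `kmeans`, `fit`, `predict`) and the toy data are hypothetical.

```python
import random

def mean_point(points):
    """Coordinate-wise mean; for an object's samples this estimates its expected location."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def expected_sq_dist(samples, rep):
    """Sample estimate of E[||X - rep||^2] for an uncertain object X."""
    return sum(sq_dist(s, rep) for s in samples) / len(samples)

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (assumed stand-in for learning subclass representatives)."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Keep the old center if a group happens to be empty.
        centers = [mean_point(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers

def fit(objects, labels, subclasses_per_class):
    """Learn (representative, class label) pairs, several per class."""
    reps = []
    for label in sorted(set(labels)):
        means = [mean_point(o) for o, l in zip(objects, labels) if l == label]
        reps.extend((c, label) for c in kmeans(means, subclasses_per_class[label]))
    return reps

def predict(reps, samples):
    """Assign the class of the representative with the smallest expected distance."""
    return min(reps, key=lambda r: expected_sq_dist(samples, r[0]))[1]

def make_object(cx, cy, spread=0.5):
    """Toy uncertain object: four samples around a center (illustrative only)."""
    return [(cx - spread, cy - spread), (cx - spread, cy + spread),
            (cx + spread, cy - spread), (cx + spread, cy + spread)]

# Class "A" forms two groups that are separated by class "B".
centers_a = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centers_b = [(5, 5), (6, 5), (5, 6)]
objects = [make_object(x, y) for x, y in centers_a + centers_b]
labels = ["A"] * len(centers_a) + ["B"] * len(centers_b)

reps = fit(objects, labels, {"A": 2, "B": 1})
print(predict(reps, make_object(9.5, 9.5)))
print(predict(reps, make_object(5.0, 5.5)))
```

In this toy data, a single representative for class "A" would lie near (5.3, 5.3), in the middle of class "B", which is the situation that motivates learning one representative per subclass.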

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, L., Hung, E. (2012). Improving Classification Accuracy on Uncertain Data by Considering Multiple Subclasses. In: Thielscher, M., Zhang, D. (eds) AI 2012: Advances in Artificial Intelligence. AI 2012. Lecture Notes in Computer Science, vol 7691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35101-3_63

  • DOI: https://doi.org/10.1007/978-3-642-35101-3_63

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35100-6

  • Online ISBN: 978-3-642-35101-3

  • eBook Packages: Computer Science, Computer Science (R0)
