Bayesian Classifiers for Positive Unlabeled Learning

  • Conference paper
Web-Age Information Management (WAIM 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6897)

Abstract

This paper studies the problem of Positive Unlabeled learning (PU learning), where only positive and unlabeled examples are available for training. Naive Bayes (NB) and Tree Augmented Naive Bayes (TAN) have already been extended to PU learning algorithms (PNB and PTAN). However, both require a user-specified parameter that is difficult for the user to provide in practice. We estimate this parameter following [2] by adopting the “selected completely at random” assumption, and we reformulate the two algorithms under this assumption. Furthermore, we extend the supervised algorithms Averaged One-Dependence Estimators (AODE), Hidden Naive Bayes (HNB) and Full Bayesian network Classifier (FBC) to PU learning algorithms (PAODE, PHNB and PFBC respectively). Experimental results on 20 UCI datasets show that the performance of the Bayesian algorithms for PU learning is comparable to that of the corresponding supervised ones in most cases. Additionally, PNB and PFBC are more robust against unlabeled data, and PFBC generally performs the best.
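The abstract does not spell out the estimator, so the following is only a minimal sketch of the kind of estimate described in [2], not the paper's own procedure. Under the “selected completely at random” assumption, the label frequency c = p(s=1 | y=1) can be approximated by averaging a classifier's output g(x) = p(s=1 | x) over held-out labeled positives; the positive class prior then follows as p(y=1) = p(s=1) / c. All function and parameter names here are hypothetical, and the scores are assumed to come from some probabilistic classifier trained to separate labeled from unlabeled examples.

```python
from typing import Sequence


def estimate_label_frequency(held_out_positive_scores: Sequence[float]) -> float:
    """Estimate c = p(s=1 | y=1) as the mean of g(x) over held-out labeled positives.

    Assumption: each score is g(x) = p(s=1 | x) from a classifier trained on
    labeled-vs-unlabeled data (hypothetical input, not from the paper).
    """
    if len(held_out_positive_scores) == 0:
        raise ValueError("need at least one held-out labeled positive example")
    return sum(held_out_positive_scores) / len(held_out_positive_scores)


def estimate_positive_prior(n_labeled: int, n_total: int, c: float) -> float:
    """Estimate p(y=1) = p(s=1) / c, clipped to 1.0 because c is only an estimate."""
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    return min(1.0, (n_labeled / n_total) / c)


def adjust_posterior(g_x: float, c: float) -> float:
    """Convert g(x) = p(s=1 | x) into p(y=1 | x) = g(x) / c, clipped to 1.0."""
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    return min(1.0, g_x / c)


if __name__ == "__main__":
    # Illustrative numbers only: three held-out positives, 100 labeled of 1000 examples.
    c = estimate_label_frequency([0.5, 0.3, 0.4])
    print("label frequency c:", c)
    print("positive prior:", estimate_positive_prior(100, 1000, c))
    print("adjusted posterior for g(x)=0.2:", adjust_posterior(0.2, c))
```

Clipping to 1.0 is a pragmatic choice for the sketch: because c is estimated from a finite sample, the raw ratios can slightly exceed 1.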

This work is supported by the National Natural Science Foundation of China (60873196) and the Chinese Universities Scientific Fund (QN2009092).

References

  1. Zhang, D., Lee, W.S.: A Simple Probabilistic Approach to Learning from Positive and Unlabeled Examples. In: Proc. of UKCI 2005, pp. 83–87 (2005)

  2. Elkan, C., Noto, K.: Learning Classifiers from Only Positive and Unlabeled Data. In: Proc. of SIGKDD 2008, pp. 213–220 (2008)

  3. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian Network Classifiers. Machine Learning 29(2), 131–163 (1997)

  4. Webb, G.I., Boughton, J.R., Wang, Z.: Not So Naive Bayes: Aggregating One-Dependence Estimators. Machine Learning 58(1), 5–24 (2005)

  5. Jiang, L., Zhang, H., Cai, Z.: A Novel Bayes Model: Hidden Naive Bayes. IEEE Transactions on Knowledge and Data Engineering 21(10), 1361–1371 (2009)

  6. Su, J., Zhang, H.: Full Bayesian Network Classifiers. In: Proc. of the 23rd ICML, pp. 897–904 (2006)

  7. Schölkopf, B., Platt, J., Shawe-Taylor, J., Smola, A., Williamson, R.: Estimating the Support of a High-Dimensional Distribution. Neural Computation 13(7), 1443–1471 (2001)

  8. Yu, H., Han, J., Chang, K.C.: PEBL: Positive Example Based Learning for Web Page Classification Using SVM. In: Proc. of the 8th SIGKDD, pp. 239–248 (2002)

  9. Liu, B., Lee, W.S., Yu, P.S., Li, X.: Partially Supervised Classification of Text Documents. In: Proc. of the 9th ICML, pp. 387–394 (2002)

  10. Li, X., Liu, B.: Learning to Classify Texts Using Positive and Unlabeled Data. In: Proc. of the 18th IJCAI, pp. 587–592 (2003)

  11. Lee, W.S., Liu, B.: Learning with Positive and Unlabeled Examples Using Weighted Logistic Regression. In: Proc. of the 3rd ICDE, pp. 448–455 (2003)

  12. Liu, B., Dai, Y., Li, X., Lee, W.S., Yu, P.S.: Building Text Classifiers Using Positive and Unlabeled Examples. In: Proc. of the 3rd ICDM, pp. 179–186 (2003)

  13. Denis, F., Gilleron, R., Tommasi, M.: Text Classification from Positive and Unlabeled Examples. In: Proc. of the 9th IPMU, pp. 1927–1934 (2002)

  14. Denis, F., Gilleron, R., Letouzey, F.: Learning from Positive and Unlabeled Examples. Theoretical Computer Science 38(1), 70–83 (2005)

  15. Zhang, Y., Li, X., Orlowska, M.: One-Class Classification of Text Streams with Concept Drift. In: Proc. of ICDMW, pp. 116–125 (2008)

  16. Li, X.L., Yu, P.S., Liu, B., Ng, S.K.: Positive Unlabeled Learning for Data Stream Classification. In: Proc. of the 9th SIAM SDM, pp. 257–268 (2009)

  17. He, J., Zhang, Y., Li, X., Wang, Y.: Naive Bayes Classifier for Positive Unlabeled Learning with Uncertainty. In: Proc. of the 10th SIAM SDM, pp. 361–372 (2010)

  18. Calvo, B., Larranaga, P., Lozano, J.A.: Learning Bayesian Classifiers from Positive and Unlabeled Examples. Pattern Recognition Letters 28(16), 2375–2384 (2007)

  19. Zadrozny, B., Elkan, C.: Transforming Classifier Scores into Accurate Multiclass Probability Estimates. In: Proc. of the 8th SIGKDD, pp. 694–699 (2002)

  20. Blake, C.L., Merz, C.J.: UCI repository of machine learning databases, http://www.ics.uci.edu/~mlearn/MLRepository.html

  21. Zhang, H., Jiang, L., Su, J.: Augmenting Naive Bayes for Ranking. In: Proc. of the 22nd ICML, pp. 1020–1027 (2005)

  22. Zhang, H., Jiang, L., Su, J.: Learning Weighted Naive Bayes with Accurate Ranking. In: Proc. of the 4th ICDM, pp. 567–570 (2004)

  23. Su, J., Zhang, H.: Learning Conditional Independence Tree for Ranking. In: Proc. of the 4th ICDM, pp. 531–534 (2004)

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

He, J., Zhang, Y., Li, X., Wang, Y. (2011). Bayesian Classifiers for Positive Unlabeled Learning. In: Wang, H., Li, S., Oyama, S., Hu, X., Qian, T. (eds) Web-Age Information Management. WAIM 2011. Lecture Notes in Computer Science, vol 6897. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23535-1_9

  • DOI: https://doi.org/10.1007/978-3-642-23535-1_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23534-4

  • Online ISBN: 978-3-642-23535-1

  • eBook Packages: Computer Science, Computer Science (R0)