Positive and Unlabeled Relational Classification Through Label Frequency Estimation

  • Conference paper
Inductive Logic Programming (ILP 2017)

Abstract

Many applications, such as knowledge base completion and automated diagnosis of patients, have access only to positive examples. They lack the negative examples that standard relational learning techniques require, and they suffer under the closed-world assumption. The corresponding propositional problem is known as Positive and Unlabeled (PU) learning. In that field, using the label frequency (the fraction of true positive examples that are labeled) is known to make learning easier. This notion has not yet been explored in the relational domain. The goal of this work is twofold: (1) to explore whether the label frequency is also useful when working with relational data, and (2) to propose a method for estimating the label frequency from relational positive and unlabeled data. Our experiments confirm the usefulness of both knowing the label frequency and our estimate of it.
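To make the role of the label frequency concrete, here is a minimal sketch of how a known label frequency can turn quantities computed from labeled data into estimates about the true positives. It assumes that labeled examples are drawn uniformly at random from the true positives, an assumption common in propositional PU learning but not stated in this abstract. The function names are hypothetical, and this is not the estimation method proposed in the paper.

```python
def estimate_positive_fraction(num_labeled, num_total, label_frequency):
    """Estimate the fraction of true positives in a PU dataset.

    Assumes labeled positives are a uniform random sample of all
    positives, so that P(labeled) = label_frequency * P(positive).
    """
    if not 0.0 < label_frequency <= 1.0:
        raise ValueError("label_frequency must be in (0, 1]")
    if num_total <= 0 or not 0 <= num_labeled <= num_total:
        raise ValueError("need 0 <= num_labeled <= num_total and num_total > 0")
    labeled_fraction = num_labeled / num_total
    # Clip at 1.0 because a too-small label frequency can overshoot.
    return min(1.0, labeled_fraction / label_frequency)


def adjust_probability(prob_labeled, label_frequency):
    """Convert a predicted P(labeled | x) into an estimate of P(positive | x).

    Uses the same assumption: P(labeled | x) = label_frequency * P(positive | x).
    """
    if not 0.0 < label_frequency <= 1.0:
        raise ValueError("label_frequency must be in (0, 1]")
    return min(1.0, prob_labeled / label_frequency)
```

For example, with 30 labeled examples out of 200 and a label frequency of 0.5, `estimate_positive_fraction(30, 200, 0.5)` suggests that roughly 30% of all examples are positive, twice the labeled fraction.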

Notes

  1. http://alchemy.cs.washington.edu/data/.

  2. http://www.cs.ox.ac.uk/activities/machlearn/mutagenesis.html.

  3. http://www.cs.ox.ac.uk/activities/machinelearning/Aleph/aleph.

Acknowledgements

JB is supported by IWT (SB/141744). JD is partially supported by the KU Leuven Research Fund (C14/17/070, C32/17/036), FWO-Vlaanderen (SBO-150033, G066818N, EOS-30992574, T004716N), Chist-Era ReGround project, and EU VA project Nano4Sports.

Author information

Corresponding author

Correspondence to Jessa Bekker.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Bekker, J., Davis, J. (2018). Positive and Unlabeled Relational Classification Through Label Frequency Estimation. In: Lachiche, N., Vrain, C. (eds) Inductive Logic Programming. ILP 2017. Lecture Notes in Computer Science, vol 10759. Springer, Cham. https://doi.org/10.1007/978-3-319-78090-0_2

  • DOI: https://doi.org/10.1007/978-3-319-78090-0_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78089-4

  • Online ISBN: 978-3-319-78090-0

  • eBook Packages: Computer Science, Computer Science (R0)
