Positive and Unlabeled Relational Classification Through Label Frequency Estimation

Bekker, Jessa; Davis, Jesse

doi:10.1007/978-3-319-78090-0_2

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10759)

Included in the following conference series:

International Conference on Inductive Logic Programming

Abstract

Many applications, such as knowledge base completion and automated diagnosis of patients, have access only to positive examples. They lack the negative examples that standard relational learning techniques require, and they suffer under the closed-world assumption. The corresponding propositional problem is known as Positive and Unlabeled (PU) learning. In this field, it is known that using the label frequency (the fraction of true positive examples that are labeled) makes learning easier. This notion has not yet been explored in the relational domain. The goal of this work is twofold: (1) to explore whether using the label frequency is also useful when working with relational data and (2) to propose a method for estimating the label frequency from relational positive and unlabeled data. Our experiments confirm the usefulness of knowing the label frequency and of our estimate.
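
To make the definition of the label frequency concrete, the following is a minimal propositional sketch, not the relational method proposed in the chapter. It assumes labeled examples are selected completely at random from the true positives, in which case P(positive | x) = P(labeled | x) / c, where c is the label frequency. All function names are hypothetical, and the data is synthetic. In a real PU setting the true labels are unknown, which is why c has to be estimated rather than computed as it is here.

```python
import random


def make_pu_labels(true_labels, c, rng):
    """Label each true positive with probability c; negatives are never labeled."""
    return [1 if y == 1 and rng.random() < c else 0 for y in true_labels]


def label_frequency(true_labels, observed_labels):
    """Fraction of truly positive examples that carry a label."""
    labeled_flags = [s for y, s in zip(true_labels, observed_labels) if y == 1]
    if not labeled_flags:
        raise ValueError("no positive examples")
    return sum(labeled_flags) / len(labeled_flags)


def adjust_probability(prob_labeled, c):
    """Turn P(labeled | x) into P(positive | x) = P(labeled | x) / c, capped at 1."""
    if not 0.0 < c <= 1.0:
        raise ValueError("label frequency must be in (0, 1]")
    return min(1.0, prob_labeled / c)


# Synthetic data: 5000 positives and 5000 negatives, true label frequency 0.3.
rng = random.Random(0)
y = [1] * 5000 + [0] * 5000
s = make_pu_labels(y, c=0.3, rng=rng)

# Computable only because the synthetic true labels are known.
c_hat = label_frequency(y, s)

# An example that a model believes is labeled with probability 0.15
# is positive with probability 0.15 / 0.3 under the stated assumption.
p_positive = adjust_probability(0.15, 0.3)
```

The sketch shows why knowing c helps: a classifier trained to separate labeled from unlabeled examples underestimates the positive class by the factor c, and dividing by c corrects for that.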

Acknowledgements

JB is supported by IWT (SB/141744). JD is partially supported by the KU Leuven Research Fund (C14/17/070, C32/17/036), FWO-Vlaanderen (SBO-150033, G066818N, EOS-30992574, T004716N), Chist-Era ReGround project, and EU VA project Nano4Sports.

Author information

Authors and Affiliations

Computer Science Department, KU Leuven, Leuven, Belgium
Jessa Bekker & Jesse Davis

Corresponding author

Correspondence to Jessa Bekker.

Editor information

Editors and Affiliations

University of Strasbourg, Strasbourg, France
Nicolas Lachiche
University of Orléans, Orléans, France
Christel Vrain

About this paper

Cite this paper

Bekker, J., Davis, J. (2018). Positive and Unlabeled Relational Classification Through Label Frequency Estimation. In: Lachiche, N., Vrain, C. (eds) Inductive Logic Programming. ILP 2017. Lecture Notes in Computer Science, vol 10759. Springer, Cham. https://doi.org/10.1007/978-3-319-78090-0_2

DOI: https://doi.org/10.1007/978-3-319-78090-0_2
Published: 15 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78089-4
Online ISBN: 978-3-319-78090-0
eBook Packages: Computer Science, Computer Science (R0)
