Epistemic Uncertainty Sampling

  • Vu-Linh Nguyen
  • Sébastien Destercke
  • Eyke Hüllermeier
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11828)


Various strategies for active learning have been proposed in the machine learning literature. In uncertainty sampling, which is among the most popular approaches, the active learner sequentially queries the label of those instances for which its current prediction is maximally uncertain. The predictions as well as the measures used to quantify the degree of uncertainty, such as entropy, are almost exclusively of a probabilistic nature. In this paper, we advocate a distinction between two different types of uncertainty, referred to as epistemic and aleatoric, in the context of active learning. Roughly speaking, these notions capture the reducible and the irreducible part of the total uncertainty in a prediction, respectively. We conjecture that, in uncertainty sampling, the usefulness of an instance is better reflected by its epistemic than by its aleatoric uncertainty. This leads us to suggest the principle of “epistemic uncertainty sampling”, which we instantiate by means of a concrete approach for measuring epistemic and aleatoric uncertainty. In experimental studies, epistemic uncertainty sampling does indeed show promising performance.
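The querying principle described above can be made concrete in a small sketch. The paper's own measure of epistemic and aleatoric uncertainty is not reproduced here; instead, this illustration uses a common ensemble-based proxy in which total predictive entropy splits into an aleatoric part (average entropy of the ensemble members) and an epistemic part (their mutual-information gap, i.e. member disagreement). All names (`entropy`, `decompose`, the toy probabilities) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def entropy(p):
    # Binary Shannon entropy in bits, numerically safe at p in {0, 1}.
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def decompose(ensemble_probs):
    """Split total predictive uncertainty into epistemic and aleatoric parts.

    ensemble_probs: shape (n_members, n_instances), each row holding one
    ensemble member's P(y=1 | x) for every unlabelled instance.
    """
    total = entropy(ensemble_probs.mean(axis=0))      # entropy of averaged prediction
    aleatoric = entropy(ensemble_probs).mean(axis=0)  # average per-member entropy
    epistemic = total - aleatoric                     # disagreement term (>= 0)
    return total, epistemic, aleatoric

# Three toy instances: members agree on 0.5 (purely aleatoric uncertainty),
# disagree strongly (mostly epistemic), and agree on a confident prediction.
probs = np.array([[0.5, 0.95, 0.9],
                  [0.5, 0.05, 0.9]])
total, epistemic, aleatoric = decompose(probs)

# Classical uncertainty sampling (argmax of total entropy) cannot separate the
# first two instances; epistemic sampling singles out the disagreement case.
query = int(np.argmax(epistemic))  # -> 1
```

Under this decomposition, labelling the second instance is what reduces the learner's own ignorance, which is exactly the intuition behind preferring epistemic over aleatoric uncertainty when selecting queries.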


Keywords: Active learning · Uncertainty sampling · Epistemic uncertainty · Aleatoric uncertainty



This work was supported by the German Research Foundation (DFG) and the French National Agency for Research (Labex MS2T).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Vu-Linh Nguyen (1)
  • Sébastien Destercke (2)
  • Eyke Hüllermeier (1)
  1. Heinz Nixdorf Institute, Department of Computer Science, Paderborn University, Paderborn, Germany
  2. UMR CNRS 7253 Heudiasyc, Sorbonne Universités, Université de Technologie de Compiègne, Compiègne, France
