Probabilistic Bounds for Binary Classification of Large Data Sets

  • Věra Kůrková
  • Marcello Sanguineti
Conference paper
Part of the Proceedings of the International Neural Networks Society book series (INNS, volume 1)


Abstract

A probabilistic model of classification task relevance is investigated. Correlations between randomly chosen functions and network input-output functions are estimated. The impact of large data sets is analyzed from the point of view of the concentration of measure phenomenon. The Azuma-Hoeffding inequality is exploited; unlike the classical Hoeffding inequality, it applies even when the naive Bayes assumption is not satisfied (i.e., when assignments of class labels to feature vectors are not independent).
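The concentration effect the abstract invokes can be illustrated numerically. The sketch below (function names are my own, not from the paper) evaluates the Azuma-Hoeffding tail bound for a martingale with bounded differences; because the inequality requires only the martingale property and bounded increments, not independence, it remains valid for dependent label assignments. For a fixed relative deviation, the bound decays exponentially in the data set size m:

```python
import math

def azuma_bound(m, t, c=1.0):
    # Azuma-Hoeffding inequality: for a martingale (S_k) with bounded
    # differences |S_k - S_{k-1}| <= c for all k,
    #   P(|S_m - S_0| >= t) <= 2 * exp(-t^2 / (2 * m * c^2)).
    # Independence of the increments is NOT required, so the bound also
    # covers dependent label assignments (naive Bayes violated).
    return 2.0 * math.exp(-(t * t) / (2.0 * m * c * c))

# Concentration of measure: a deviation of fixed relative size eps,
# i.e. t = eps * m, becomes exponentially unlikely as m grows.
for m in (100, 10_000, 1_000_000):
    eps = 0.05
    print(m, azuma_bound(m, eps * m))  # bound = 2 * exp(-eps^2 * m / 2)
```

For m = 100 the bound exceeds 1 and is vacuous; by m = 10,000 it is already below 10^-5, which is the "large data sets" regime the paper studies.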


Keywords: Binary classification · Approximation by feedforward networks · Concentration of measure · Azuma-Hoeffding inequality



Acknowledgments

V.K. was partially supported by Czech Grant Foundation grant GA 18-23827S and by institutional support of the Institute of Computer Science (RVO 67985807). M.S. was partially supported by a FFABR grant of the Italian Ministry of Education, University and Research (MIUR). He is a Research Associate at INM (Institute for Marine Engineering) of CNR (National Research Council of Italy) under the Project PDGP 2018/20 DIT.AD016.001 "Technologies for Smart Communities", and he is a member of GNAMPA-INdAM (Gruppo Nazionale per l'Analisi Matematica, la Probabilità e le loro Applicazioni - Istituto Nazionale di Alta Matematica).



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Institute of Computer Science, Czech Academy of Sciences, Prague, Czech Republic
  2. Department of Computer Science, Bioengineering, Robotics, and Systems Engineering (DIBRIS), University of Genoa, Genoa, Italy
