Probabilistic Bounds for Binary Classification of Large Data Sets
A probabilistic model for classification of task relevance is investigated. Correlations between randomly chosen functions and network input-output functions are estimated. The impact of large data sets is analyzed from the point of view of the concentration of measure phenomenon. The Azuma-Hoeffding inequality is exploited; it can also be applied when the naive Bayes assumption is not satisfied (i.e., when assignments of class labels to feature vectors are not independent).
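As a rough illustration of the concentration effect described above, the following sketch estimates how often the normalized correlation between a fixed ±1-valued function and randomly chosen ±1 labels on m points exceeds a threshold t, and compares that frequency with the standard Azuma-Hoeffding tail bound 2·exp(−m·t²/2) for martingale differences bounded by 1/m. It is not the paper's model: the function names, the alternating-sign function, and the use of independent uniform labels are assumptions made only to keep the example short. The Azuma-Hoeffding inequality itself needs only bounded martingale differences, not independence.

```python
import math
import random


def empirical_tail(m, t, trials, seed=0):
    """Fraction of trials in which |<f, g>| / m >= t for random labels g.

    f is a fixed +/-1 function on m points (hypothetical: alternating signs);
    g assigns independent uniform +/-1 labels (a simplifying assumption).
    """
    rng = random.Random(seed)
    f = [1 if i % 2 == 0 else -1 for i in range(m)]
    exceed = 0
    for _ in range(trials):
        # Inner product of the fixed function with one random labeling.
        s = sum(fi * rng.choice((-1, 1)) for fi in f)
        if abs(s) / m >= t:
            exceed += 1
    return exceed / trials


def azuma_hoeffding_bound(m, t):
    """Two-sided tail bound 2*exp(-t^2 / (2*sum c_i^2)) with c_i = 1/m."""
    return 2.0 * math.exp(-m * t * t / 2.0)


if __name__ == "__main__":
    m, t = 1000, 0.1
    print("empirical frequency:", empirical_tail(m, t, trials=2000))
    print("Azuma-Hoeffding bound:", azuma_hoeffding_bound(m, t))
```

Because the bound decays exponentially in m for fixed t, increasing the size of the data set makes a large correlation with a randomly chosen labeling rapidly less likely.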
Keywords: Binary classification; Approximation by feedforward networks; Concentration of measure; Azuma-Hoeffding inequality
V.K. was partially supported by the Czech Grant Foundation grant GA 18-23827S and by institutional support of the Institute of Computer Science RVO 67985807. M.S. was partially supported by a FFABR grant of the Italian Ministry of Education, University and Research (MIUR). He is a Research Associate at INM (Institute for Marine Engineering) of CNR (National Research Council of Italy) under the Project PDGP 2018/20 DIT.AD016.001 “Technologies for Smart Communities” and a member of GNAMPA-INdAM (Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni - Istituto Nazionale di Alta Matematica).