Abstract
Traditional statistical learning theory relies on the assumption that data are independently and identically distributed (i.i.d.). However, this assumption fails in many real-life applications. In this survey, we explore learning scenarios where examples are dependent and their dependence is described by a dependency graph, a model commonly used in probability and combinatorics. We collect various graph-dependent concentration bounds, which are then used to derive Rademacher complexity and stability generalization bounds for learning from graph-dependent data. We illustrate this paradigm through practical learning tasks and suggest directions for future research. To our knowledge, this survey is the first of its kind on this subject.
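To give a flavor of the graph-dependent concentration bounds the survey collects, the sketch below evaluates a Hoeffding-type tail bound of the form 2·exp(−2t² / (χ·Σᵢ(bᵢ−aᵢ)²)) for a sum of bounded variables with a given dependency graph, where Janson's bound uses the fractional chromatic number χ*(G); here a greedy proper coloring supplies a (possibly loose) upper bound χ(G) ≥ χ*(G), so the computed bound remains valid. The graph, variable ranges, and function names are illustrative assumptions, not part of the survey's notation.

```python
import math

# Hypothetical dependency graph on 6 variables: an edge joins two
# variables that are dependent (here, a simple path 0-1-2-3-4-5,
# as would arise from 1-dependent observations).
n = 6
edges = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)}
adj = [set() for _ in range(n)]
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def greedy_coloring(adj):
    """Proper vertex coloring; the number of colors it uses upper-bounds
    the fractional chromatic number appearing in Janson's bound."""
    color = {}
    for v in range(len(adj)):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

chi = 1 + max(greedy_coloring(adj).values())  # a path is 2-colorable

def dependent_hoeffding_tail(t, ranges, chi):
    """Two-sided tail bound 2*exp(-2 t^2 / (chi * sum (b_i - a_i)^2))
    for a sum of bounded graph-dependent variables (Janson-style)."""
    return 2 * math.exp(-2 * t ** 2 / (chi * sum(r ** 2 for r in ranges)))

ranges = [1.0] * n          # each variable lies in an interval of length 1
bound = dependent_hoeffding_tail(3.0, ranges, chi)
```

With chi = 1 (no edges) this recovers the classical i.i.d. Hoeffding bound, so the coloring number cleanly quantifies the price of dependence.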
Availability of data and materials
Not applicable.
Code availability
Not applicable.
Acknowledgements
R.-R. Z. thanks David Wood for email communications on tree-partitions. The authors are sincerely grateful to the referees for carefully reading the manuscript and providing invaluable comments and suggestions, which led to a substantial improvement in the presentation.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Author information
Authors and Affiliations
Contributions
R.-R. Z.: the first and final draft, stability bound, and its applications. M.-R. A.: fractional Rademacher complexity bound, and its applications.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
All authors who participated in this study give the publisher permission to publish this work.
Additional information
Editor: Aryeh Kontorovich.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, RR., Amini, MR. Generalization bounds for learning under graph-dependence: a survey. Mach Learn (2024). https://doi.org/10.1007/s10994-024-06536-9