Abstract
Many applications, such as election forecasting, environmental monitoring, health policy, and graph-based machine learning, require taking the expectation of functions defined on the vertices of a graph. We describe the construction of a sampling scheme, analogous to the so-called Leja points in complex potential theory, that provably gives low-discrepancy estimates for the approximation of the expected value by the empirical expected value based on these points. In contrast to classical potential theory, where the kernel is fixed and the equilibrium distribution depends upon the kernel, we fix a probability distribution and construct a kernel (which represents the graph structure) for which the equilibrium distribution is the given probability distribution. Our estimates do not depend upon the size of the graph.
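To make the idea of Leja-type sampling on a graph concrete, the following is a minimal sketch under stated assumptions, not the construction analyzed in the paper. It assumes the target distribution is uniform, uses the heat kernel of a cycle graph as an illustrative kernel (its rows sum to one, so the potential of the uniform distribution is constant), and picks each new vertex greedily as the one minimizing the summed kernel potential of the vertices already chosen. The function names, the graph, the diffusion time `t`, and the test function `f` are all hypothetical choices made for illustration.

```python
import numpy as np

def cycle_heat_kernel(n, t=50.0):
    """Heat kernel exp(-t L) of the n-cycle (an illustrative kernel, assumed here)."""
    A = np.zeros((n, n))
    idx = np.arange(n)
    A[idx, (idx + 1) % n] = 1.0
    A[(idx + 1) % n, idx] = 1.0
    L = np.diag(A.sum(axis=1)) - A          # combinatorial graph Laplacian
    w, V = np.linalg.eigh(L)                # L is symmetric, so eigh applies
    return (V * np.exp(-t * w)) @ V.T       # V diag(exp(-t w)) V^T

def greedy_leja(K, m, start=0):
    """Choose m distinct vertices; each new one minimizes the summed
    kernel potential generated by the vertices chosen so far."""
    chosen = [start]
    potential = K[:, start].copy()
    for _ in range(m - 1):
        masked = potential.copy()
        masked[chosen] = np.inf             # never pick the same vertex twice
        nxt = int(np.argmin(masked))
        chosen.append(nxt)
        potential += K[:, nxt]
    return chosen

n = 64
K = cycle_heat_kernel(n)
points = greedy_leja(K, 8)

# Compare the empirical mean over the sampled vertices with the true mean
# under the uniform distribution, for a hypothetical smooth test function.
f = np.cos(2 * np.pi * np.arange(n) / n) ** 2
error = abs(f[points].mean() - f.mean())
print(points, error)
```

The greedy rule mirrors the classical Leja recipe, where each new point minimizes the potential of the points already placed. For a non-uniform target distribution the kernel would have to be rescaled so that the target becomes its equilibrium distribution; that step is deliberately left out of this sketch.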
Acknowledgements
The work of AC was supported in part by NSF DMS Grants 2012266 and 1819222, and Sage Foundation Grant 2196. The work of HNM is supported in part by NSF DMS Grant 2012355 and ARO Grant W911NF2110218. We thank Professor Percus at Claremont Graduate University for his help in securing the Proposition data set, which was sent to us by Dr. Linhong Zhu at USC Information Sciences Institute in Marina del Rey, California.
Additional information
Communicated by Isaac Pesenson.
Cite this article
Cloninger, A., Mhaskar, H.N. A Low Discrepancy Sequence on Graphs. J Fourier Anal Appl 27, 76 (2021). https://doi.org/10.1007/s00041-021-09865-8