Advances in Computational Mathematics, Volume 45, Issue 3, pp 1439–1468

The empirical Christoffel function with applications in data analysis

  • Jean B. Lasserre
  • Edouard Pauwels

Abstract

We illustrate the potential applications in machine learning of the Christoffel function, or, more precisely, its empirical counterpart associated with a counting measure uniformly supported on a finite set of points. Firstly, we provide a thresholding scheme which allows approximating the support of a measure from a finite subset of its moments with strong asymptotic guarantees. Secondly, we provide a consistency result which relates the empirical Christoffel function and its population counterpart in the limit of large samples. Finally, we illustrate the relevance of our results on simulated and real-world datasets for several applications in statistics and machine learning: (a) density and support estimation from finite samples, (b) outlier and novelty detection, and (c) affine matching.
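
As a rough illustration of the object discussed in the abstract, the sketch below evaluates an empirical Christoffel function using its standard moment-matrix form, 1 / (v(x)ᵀ M⁻¹ v(x)), where v(x) collects the monomials up to a chosen degree and M is the moment matrix of the uniform counting measure on the sample. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the monomial basis, the degree 2, the uniform sample on the unit disk, and the plain Gaussian elimination solver. The paper's actual thresholding scheme and choice of degree are not reproduced.

```python
import itertools
import math
import random


def monomial_exponents(dim, degree):
    """All exponent tuples alpha in N^dim with |alpha| <= degree."""
    return [a for a in itertools.product(range(degree + 1), repeat=dim)
            if sum(a) <= degree]


def monomial_vector(x, exponents):
    """Evaluate every monomial x^alpha at the point x."""
    return [math.prod(xi ** ai for xi, ai in zip(x, a)) for a in exponents]


def moment_matrix(points, exponents):
    """Moment matrix of the uniform counting measure on `points`."""
    s = len(exponents)
    M = [[0.0] * s for _ in range(s)]
    for x in points:
        v = monomial_vector(x, exponents)
        for i in range(s):
            for j in range(s):
                M[i][j] += v[i] * v[j] / len(points)
    return M


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def christoffel(x, M, exponents):
    """Empirical Christoffel function at x: 1 / (v(x)^T M^{-1} v(x))."""
    v = monomial_vector(x, exponents)
    z = solve(M, v)
    return 1.0 / sum(vi * zi for vi, zi in zip(v, z))


def sample_unit_disk(n, seed=0):
    """Hypothetical test data: n points drawn uniformly from the unit disk."""
    rng = random.Random(seed)
    pts = []
    while len(pts) < n:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1.0:
            pts.append((x, y))
    return pts


points = sample_unit_disk(500)
exponents = monomial_exponents(dim=2, degree=2)
M = moment_matrix(points, exponents)
# A point inside the sampled region versus a point far outside it.
for x in [(0.0, 0.0), (3.0, 3.0)]:
    print(x, christoffel(x, M, exponents))
```

The function takes much smaller values far from the sample than inside it, which is the behaviour a thresholding rule for support estimation or outlier detection relies on. A convenient sanity check is that the reciprocal values v(x)ᵀ M⁻¹ v(x), averaged over the sample points themselves, equal the number of monomials (6 here), since that average is the trace of M⁻¹M.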

Keywords

Christoffel function, Statistics, Support inference, Density estimation, Consistency

Mathematics Subject Classification (2010)

62-07 62H99 68T05 

Notes

Acknowledgements

The research of the first author was funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement 666981 TAMING).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. LAAS-CNRS and Institute of Mathematics, University of Toulouse, Toulouse, France
  2. IRIT, Université Toulouse 3 Paul Sabatier, Toulouse, France
