
Data Mining and Knowledge Discovery, Volume 33, Issue 1, pp 230–251

Robust classification of graph-based data

  • Carlos M. Alaíz
  • Michaël Fanuel
  • Johan A. K. Suykens
Article

Abstract

A graph-based classification method is proposed for both semi-supervised learning on Euclidean data and classification on graph data. Our manifold learning technique is based on a convex optimization problem that combines a convex quadratic regularization term with a concave quadratic loss function, where the trade-off parameter is chosen carefully so that the objective function remains convex. As shown empirically, the advantage of considering a concave loss function is that the learning problem becomes more robust in the presence of noisy labels. Furthermore, the loss function considered here more closely resembles a classification loss, whereas several other methods treat graph-based classification problems as regression problems.
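For context, the regression-style treatment that the abstract contrasts against can be sketched with the classic local-and-global-consistency label propagation of Zhou et al. (2004), which fits the known labels through a convex quadratic loss. The sketch below is illustrative only — it is not the concave-loss method proposed in this paper — and the function name and the toy two-cluster graph are invented for the example.

```python
import numpy as np

def label_propagation(W, y, alpha=0.9):
    """Quadratic-loss graph label propagation (Zhou et al. style).

    W     : (n, n) symmetric adjacency matrix of the graph
    y     : (n,) labels in {-1, 0, +1}; 0 marks unlabeled nodes
    alpha : trade-off between graph smoothness and fitting the labels
    """
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt          # symmetrically normalized affinity
    n = W.shape[0]
    # Closed-form minimizer of f^T (I - S) f + ((1 - alpha)/alpha) * ||f - y||^2,
    # i.e. f = (1 - alpha) * (I - alpha * S)^{-1} y
    f = np.linalg.solve(np.eye(n) - alpha * S, (1 - alpha) * y)
    return np.sign(f)

# Two triangles joined by a single bridge edge; one label per cluster.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0
y = np.array([1.0, 0, 0, 0, 0, -1.0])
print(label_propagation(W, y))  # nodes 0-2 -> +1, nodes 3-5 -> -1
```

Because the fitting term is a squared (regression) penalty, a single flipped label pulls the solution toward it with unbounded force; the concave loss studied in the paper is designed to limit exactly this effect.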

Keywords

Classification · Graph data · Semi-supervised learning


Acknowledgements

The authors would like to thank the following organizations. EU: The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC AdG A-DATADRIVE-B (290923). This paper reflects only the authors’ views; the Union is not liable for any use that may be made of the contained information. Research Council KUL: GOA/10/09 MaNet, CoE PFV/10/002 (OPTEC), BIL12/11T; Ph.D./Postdoc Grants. Flemish Government: FWO: G.0377.12 (Structured systems), G.088114N (Tensor based data similarity); Ph.D./Postdoc Grants. IWT: SBO POM (100031); Ph.D./Postdoc Grants. iMinds Medical Information Technologies SBO 2014. Belgian Federal Science Policy Office: IUAP P7/19 (DYSCO, Dynamical systems, control and optimization, 2012–2017). Fundación BBVA: project FACIL–Ayudas Fundación BBVA a Equipos de Investigación Científica 2016. UAM–ADIC Chair for Data Science and Machine Learning. Concerted Research Action (ARC) programme supported by the Federation Wallonia-Brussels (contract ARC 14/19-060 on Mining and Optimization of Big Data Models).


Copyright information

© The Author(s) 2018

Authors and Affiliations

  • Carlos M. Alaíz (1; corresponding author)
  • Michaël Fanuel (2)
  • Johan A. K. Suykens (2)

  1. Departamento de Ingeniería Informática, Universidad Autónoma de Madrid, Madrid, Spain
  2. Department of Electrical Engineering (ESAT–STADIUS), KU Leuven, Leuven, Belgium
