Machine Learning, Volume 106, Issue 2, pp 307–335

Adaptive edge weighting for graph-based learning algorithms


Abstract

Graph-based learning algorithms, including label propagation and spectral clustering, are effective state-of-the-art methods for a variety of machine learning tasks. Given input data, i.e., feature vectors, graph-based methods typically proceed in three steps: (1) generating graph edges, (2) estimating edge weights, and (3) running a graph-based algorithm. The first and second steps are difficult, especially when only a few (or no) labeled instances are available, yet they are important because the performance of graph-based methods depends heavily on the quality of the input graph. For the second step, we propose a new method that optimizes edge weights by minimizing a local linear reconstruction error, under the constraint that edges are parameterized by a similarity function of node pairs. As a result, the generated graph can capture the manifold structure of the input data, with each edge representing the similarity of a node pair. To further justify this approach, we provide analytical considerations for our formulation, such as an interpretation as cross-validation of a propagation model in the feature space and an error analysis based on a low-dimensional manifold model. Experimental results demonstrate the effectiveness of our adaptive edge weighting strategy on both synthetic and real datasets.
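The local linear reconstruction idea behind step (2) can be illustrated with a minimal sketch. The function name and parameters below are ours, not the paper's, and the sketch shows only the unconstrained LLE-style reconstruction weights (Roweis and Saul's construction); the proposed method additionally constrains the weights to come from a parameterized similarity function of node pairs and optimizes its parameters against this reconstruction error.

```python
import numpy as np

def knn_reconstruction_weights(X, k=5, reg=1e-3):
    """For each point x_i, compute weights on its k nearest neighbors
    that minimize the local linear reconstruction error
    ||x_i - sum_j w_ij x_j||^2  subject to  sum_j w_ij = 1."""
    n = X.shape[0]
    # Pairwise squared distances; exclude self-matches.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]        # indices of k nearest neighbors
        Z = X[nbrs] - X[i]                  # neighbors centered on x_i
        G = Z @ Z.T                         # local Gram matrix
        G += reg * np.trace(G) * np.eye(k)  # regularize for numerical stability
        w = np.linalg.solve(G, np.ones(k))  # solve G w = 1
        W[i, nbrs] = w / w.sum()            # enforce sum-to-one constraint
    return W
```

For points lying near a low-dimensional manifold, the resulting weight matrix reconstructs each point almost exactly from its neighborhood, which is the property the adaptive edge weighting exploits when fitting the similarity function.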

Keywords

Graph-based learning · Manifold assumption · Edge weighting · Semi-supervised learning · Clustering


Copyright information

© The Author(s) 2016

Authors and Affiliations

  1. Department of Engineering, Nagoya Institute of Technology, Nagoya, Japan
  2. Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan
  3. Department of Computer Science, Aalto University, Espoo, Finland
