Toward online node classification on streaming networks

Article

Abstract

The proliferation of networked data in various disciplines motivates a surge of research interests on network or graph mining. Among them, node classification is a typical learning task that focuses on exploiting the node interactions to infer the missing labels of unlabeled nodes in the network. A vast majority of existing node classification algorithms overwhelmingly focus on static networks and they assume the whole network structure is readily available before performing learning algorithms. However, it is not the case in many real-world scenarios where new nodes and new links are continuously being added in the network. Considering the streaming nature of networks, we study how to perform online node classification on this kind of streaming networks (a.k.a. online learning on streaming networks). As the existence of noisy links may negatively affect the node classification performance, we first present an online network embedding algorithm to alleviate this problem by obtaining the embedding representation of new nodes on the fly. Then we feed the learned embedding representation into a novel online soft margin kernel learning algorithm to predict the node labels in a sequential manner. Theoretical analysis is presented to show the superiority of the proposed framework of online learning on streaming networks (OLSN). Extensive experiments on real-world networks further demonstrate the effectiveness and efficiency of the proposed OLSN framework.

Keywords

Online learning Streaming networks Node classification Network embedding 

References

  1. Aggarwal CC, Li N (2011) On node classification in dynamic content-based networks. In: The 11th SIAM international conference on data mining. SIAM, pp 355–366Google Scholar
  2. Ahmed A, Shervashidze N, Narayanamurthy S, Josifovski V, Smola AJ (2013) Distributed large-scale natural graph factorization. In: Proceedings of the 22nd international conference on World Wide Web. ACM, pp 37–48Google Scholar
  3. Belkin M, Niyogi P (2001) Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv Neural Inf Process Syst 14:585–591Google Scholar
  4. Bhagat S, Cormode G, Muthukrishnan S (2011) Node classification in social networks. In: Aggarwal C (ed) Social network data analytics. Springer, pp 115–148Google Scholar
  5. Chang S, Han W, Tang J, Qi GJ, Aggarwal CC, Huang TS (2015) Heterogeneous network embedding via deep architectures. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 119–128Google Scholar
  6. Chang S, Zhang Y, Tang J, Yin D, Chang Y, Hasegawa-Johnson M, Huang TS (2016) Positive-unlabeled learning in streaming networks. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 755–764Google Scholar
  7. Chen C, Tong H, Xie L, Ying L, He Q (2016) Fascinate: fast cross-layer dependency inference on multi-layered networks. In: Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 765–774Google Scholar
  8. Crammer K, Singer Y (2002) On the algorithmic implementation of multiclass kernel-based vector machines. J Mach Learn Res 2:265–292MATHGoogle Scholar
  9. Crammer K, Singer Y (2003) Ultraconservative online algorithms for multiclass problems. J Mach Learn Res 3:951–991MATHGoogle Scholar
  10. Crammer K, Dekel O, Keshet J, Shalev-Shwartz S, Singer Y (2006) Online passive-aggressive algorithms. J Mach Learn Res 7:551–585MathSciNetMATHGoogle Scholar
  11. Francesco O, Crammer K (2010) New adaptive algorithms for online classification. In: Advances in neural information processing systems, pp 1840–1848Google Scholar
  12. Grover A, Leskovec J (2016) node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 855–864Google Scholar
  13. Gu Q, Han J (2014) Online spectral learning on a graph with bandit feedback. In: Proceedings of the 2014 IEEE international conference on data mining. IEEE, pp 833–838Google Scholar
  14. Guo T, Zhu X, Pei J, Zhang C (2014) Snoc: streaming network node classification. In: Proceedings of the 2014 IEEE international conference on data mining. IEEE, pp 150–159Google Scholar
  15. Hazan E, Kale S (2011) Newtron: an efficient bandit algorithm for online multiclass prediction. In: Advances in neural information processing systems, pp 891–899Google Scholar
  16. Helmbold DP, Kivinen J, Warmuth MK (1999) Relative loss bounds for single neurons. IEEE Trans Neural Netw 10(6):1291–1304CrossRefGoogle Scholar
  17. Huang X, Li J, Hu X (2017a) Accelerated attributed network embedding. In: Proceedings of the 2017 SIAM international conference on data mining. SIAM, pp 633–641Google Scholar
  18. Huang X, Li J, Hu X (2017b) Label informed attributed network embedding. In: Proceedings of the 10th ACM international conference on web search and data mining. ACM, pp 731–739Google Scholar
  19. Jian L, Shen S, Li J, Liang X, Li L (2016) Budget online learning algorithm for least squares svm. IEEE transactions on neural networks and learning systemsGoogle Scholar
  20. Hu X, Tang L, Tang J, Liu H (2013) Exploiting social relations for sentiment analysis in microblogging. In: Proceedings of the 6th ACM international conference on web search and data mining. ACM, pp 537–546Google Scholar
  21. Kakade SM, Shalev-Shwartz S, Tewari A (2008) Efficient bandit algorithms for online multiclass prediction. In: Proceedings of the 25th international conference on machine learning. ACM, pp 440–447Google Scholar
  22. Kivinen J, Smola AJ, Williamson RC (2002) Large margin classification for moving targets. In: International conference on algorithmic learning theory. Springer, pp 113–127Google Scholar
  23. Kivinen J, Smola AJ, Williamson RC (2004) Online learning with kernels. IEEE Trans Signal Process 52(8):2165–2176MathSciNetCrossRefGoogle Scholar
  24. Li L, Tong H (2015) The child is father of the man: Foresee the success at the early stage. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, pp 655–664Google Scholar
  25. Li J, Hu X, Tang J, Liu H (2015a) Unsupervised streaming feature selection in social media. In: Proceedings of the 24th ACM international conference on conference on information and knowledge management. ACM, pp 1041–1050Google Scholar
  26. Li L, Tong H, Xiao Y, Fan W (2015b) Cheetah: fast graph kernel tracking on dynamic graphs. In: Proceedings of the 2015 SIAM international conference on data mining. SIAM, pp 280–288Google Scholar
  27. Li J, Hu X, Wu L, Liu H (2016) Robust unsupervised feature selection on networked data. In: Proceedings of the 2016 SIAM international conference on data mining. SIAMGoogle Scholar
  28. Li J, Wu L, Zaïane OR, Liu H (2017a) Toward personalized relational learning. In: Proceedings of the 2017 SIAM international conference on data mining. SIAM, pp 444–452Google Scholar
  29. Li J, Dani H, Hu X, Tang J, Chang Y, Liu H (2017b) Attributed network embedding for learning in a dynamic environment. arXiv preprint arxiv:1706.01860 Google Scholar
  30. Lu Q, Getoor L (2003) Link-based classification. In: Proceedings of the 20th international conference on machine learning, pp 496–503Google Scholar
  31. Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 701–710Google Scholar
  32. Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326CrossRefGoogle Scholar
  33. Schölkopf B, Smola AJ (2002) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, CambridgeGoogle Scholar
  34. Shalev-Shwartz S, Singer Y, Srebro N, Cotter A (2011) Pegasos: primal estimated sub-gradient solver for svm. Math Program 127(1):3–30MathSciNetCrossRefMATHGoogle Scholar
  35. Singh AP, Gordon GJ (2008) Relational learning via collective matrix factorization. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 650–658Google Scholar
  36. Staiano J, Lepri B, Aharony N, Pianesi F, Sebe N, Pentland A (2012) Friends don’t lie: inferring personality traits from social network structure. In: Proceedings of the 2012 ACM conference on ubiquitous computing. ACM, pp 321–330Google Scholar
  37. Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q (2015) Line: large-scale information network embedding. In: Proceedings of the 24th international conference on World Wide Web. ACM, pp 1067–1077Google Scholar
  38. Wang Z, Crammer K, Vucetic S (2012a) Breaking the curse of kernelization: budgeted stochastic gradient descent for large-scale svm training. J Mach Learn Res 13(10):3103–3131MathSciNetMATHGoogle Scholar
  39. Wang J, Zhao P, Hoi SCH (2012b) Exact soft confidence-weighted learning. In: Proceedings of the 29th international conference on machine learning, pp 121–128Google Scholar
  40. Wang J, Zhao P, Hoi SCH (2014) Cost-sensitive online classification. IEEE Trans Knowl Data Eng 26(10):2425–2438CrossRefGoogle Scholar
  41. Wei X, Cao B, Yu PS (2016) Unsupervised feature selection on networks: a generative view. In: Proceedings of the 30th AAAI conference on artificial intelligenceGoogle Scholar
  42. Wilkinson JH, Wilkinson JH (1965) The algebraic eigenvalue problem, vol 87. Clarendon Press, OxfordMATHGoogle Scholar
  43. Xia H, Hoi SC, Jin R, Zhao P (2014) Online multiple kernel similarity learning for visual search. IEEE Trans Pattern Anal Mach Intell 36(3):536–549CrossRefGoogle Scholar
  44. Yan S, Xu D, Zhang B, Zhang HJ, Yang Q, Lin S (2007) Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell 29(1):40–51CrossRefGoogle Scholar
  45. Yang C, Liu Z, Zhao D, Sun M, Chang EY (2015) Network representation learning with rich text information. In: Proceedings of the 24 international conference on artificial intelligence. IJCAI/AAAI Press, pp 2111–2117Google Scholar
  46. Zhang D, Yin J, Zhu X, Zhang C (2016) Collective classification via discriminative matrix factorization on sparsely labeled networks. In: Proceedings of the 25th ACM international conference on conference on information and knowledge management. ACM, pp 1563–1572Google Scholar
  47. Zhao P, Hoi SCH (2013) Cost-sensitive online active learning with application to malicious url detection. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining, pp 919–927Google Scholar
  48. Zhu S, Yu K, Chi Y, Gong Y (2007) Combining content and link for classification using matrix factorization. In: Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval. ACM, pp 487–494Google Scholar

Copyright information

© The Author(s) 2017

Authors and Affiliations

  1. 1.College of ScienceChina University of PetroleumQingdaoChina
  2. 2.Computer Science and EngineeringArizona State UniversityTempeUSA

Personalised recommendations