Neural Processing Letters, Volume 48, Issue 2, pp 669–689

Time Series Prediction for Graphs in Kernel and Dissimilarity Spaces

  • Benjamin Paaßen
  • Christina Göpfert
  • Barbara Hammer


Graphs are a flexible and general formalism that provides rich models in various important domains, such as distributed computing, intelligent tutoring systems, or social network analysis. In many cases, such models need to take changes in the graph structure into account, that is, changes in the number of nodes or in the graph connectivity. Predicting such changes within graphs can be expected to yield important insights into the underlying dynamics, e.g. into user behaviour. However, past predictive techniques have focused almost exclusively on single edges or nodes. In this contribution, we attempt to predict the future state of a graph as a whole. We propose to phrase time series prediction as a regression problem and to apply dissimilarity- or kernel-based regression techniques, such as 1-nearest neighbor, kernel regression, and Gaussian process regression, which can be applied to graphs via graph kernels. The output of the regression is a point embedded in a pseudo-Euclidean space, which can be analyzed using subsequent dissimilarity- or kernel-based processing methods. We discuss strategies to speed up Gaussian process regression from cubic to linear time and evaluate our approach on two well-established theoretical models of graph evolution as well as on two real data sets from the domain of intelligent tutoring systems. We find that simple regression methods, such as kernel regression, are sufficient to capture the dynamics in the theoretical models, but that Gaussian process regression significantly improves the prediction error for real-world data.
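To illustrate the idea of phrasing time series prediction as kernel-based regression, the following is a minimal sketch, not the authors' implementation. It assumes that each training pair consists of a state at time t and its successor at time t+1, and that the prediction is represented only implicitly, as a weighted combination of the training successors in kernel space. All names (`rbf_kernel`, `kernel_regression_weights`, `squared_distance_to_prediction`) are hypothetical, and a Gaussian kernel on short tuples stands in for a graph kernel so that the example stays self-contained.

```python
import math


def rbf_kernel(a, b, bandwidth=0.3):
    """Stand-in kernel on tuples (assumption); a graph kernel would replace it."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def kernel_regression_weights(x_new, xs, kernel):
    """Kernel regression weights: kernel values to the training inputs, normalized to sum to 1."""
    values = [kernel(x_new, x) for x in xs]
    total = sum(values)
    return [v / total for v in values]


def squared_distance_to_prediction(weights, ys, candidate, kernel):
    """Squared kernel-space distance between sum_i w_i * phi(y_i) and phi(candidate)."""
    self_term = sum(wi * wj * kernel(yi, yj)
                    for wi, yi in zip(weights, ys)
                    for wj, yj in zip(weights, ys))
    cross_term = sum(wi * kernel(yi, candidate) for wi, yi in zip(weights, ys))
    return self_term - 2.0 * cross_term + kernel(candidate, candidate)


# Toy time series in which every state moves one step to the right.
series = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
xs, ys = series[:-1], series[1:]  # pairs (state at t, state at t+1)

# Predict the successor of the state (2.0,) as a weighted combination of the ys.
weights = kernel_regression_weights((2.0,), xs, rbf_kernel)

# The prediction has no explicit form, but its distance to any candidate can be computed.
closest = min(ys, key=lambda y: squared_distance_to_prediction(weights, ys, y, rbf_kernel))
print(closest)
```

Because the predicted point is only available through its distances or kernel values to known data, downstream steps in this sketch (here, picking the closest training successor) work on those values alone, in line with the abstract's remark that the output can be analyzed by subsequent dissimilarity- or kernel-based methods.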


Keywords: Structured data, Graphs, Time series prediction, Gaussian processes, Kernel space



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Cognitive Interaction Technology, Bielefeld, Germany
