GA-TVRC-Het: genetic algorithm enhanced time varying relational classifier for evolving heterogeneous networks

Abstract

Evolving heterogeneous networks, which contain different types of nodes and links that change over time, appear in many domains, including protein–protein interactions, scientific collaborations, and telecommunications. In this paper, we aim to discover temporal information from a heterogeneous evolving network in order to improve node classification. We propose a framework, Genetic Algorithm enhanced Time Varying Relational Classifier for evolving Heterogeneous Networks (GA-TVRC-Het), to extract the effects of different relationship types in different past time periods. These effects are discovered adaptively using genetic algorithms. A relational classifier is extended to serve as the classification method so that it can work with different types of nodes. The proposed framework is tested on two real-world data sets. The results show that using the optimal time effect substantially improves classification performance. The optimal time effect does not necessarily follow a fixed functional trend, such as linear or exponential decay in time, and it may differ for each type of interaction. Together, these two observations explain why GA-TVRC-Het outperforms methods that rely on a predefined form of time effect or that apply the same time effect to every link type.
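The core idea, learning a separate time-effect weight for each link type and each past period rather than assuming a decay curve, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the chromosome encoding, the fitness function, or the genetic operators, so everything below is an assumption. In particular, the sketch swaps in a simple weighted-vote neighbor classifier for the relational classifier used in the paper, uses plain accuracy as fitness, and works on invented toy data (`TOY_NODES`, with hypothetical link types `"cite"` and `"coauthor"`).

```python
import random

# Toy data (invented for illustration): each node is (links, true_label),
# and each link is (neighbor_label, link_type, age_in_periods).
TOY_NODES = [
    ([(1, "cite", 0), (0, "coauthor", 2)], 1),
    ([(0, "cite", 1), (1, "coauthor", 2)], 0),
    ([(1, "coauthor", 0), (0, "cite", 2)], 1),
    ([(0, "coauthor", 0), (1, "cite", 2), (0, "cite", 0)], 0),
]


def predict(links, weights):
    """Weighted vote over neighbors; weights[link_type][age] is the time effect."""
    score = 0.0
    for label, link_type, age in links:
        vote = 1 if label == 1 else -1
        score += weights[link_type][age] * vote
    return 1 if score > 0 else 0


def accuracy(nodes, weights):
    """Fraction of nodes whose predicted label matches the true label."""
    hits = sum(predict(links, weights) == y for links, y in nodes)
    return hits / len(nodes)


def evolve(nodes, link_types, n_periods, pop_size=20, generations=30, seed=0):
    """Search for per-link-type, per-period weights in [0, 1] with a simple GA."""
    rng = random.Random(seed)

    def random_individual():
        return {t: [rng.random() for _ in range(n_periods)] for t in link_types}

    def crossover(a, b):
        # Uniform crossover: each weight comes from one of the two parents.
        return {t: [rng.choice(pair) for pair in zip(a[t], b[t])]
                for t in link_types}

    def mutate(ind):
        # Perturb one randomly chosen weight and clamp it back into [0, 1].
        child = {t: list(w) for t, w in ind.items()}
        t = rng.choice(link_types)
        i = rng.randrange(n_periods)
        child[t][i] = min(1.0, max(0.0, child[t][i] + rng.gauss(0, 0.2)))
        return child

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: keep the better half, refill with offspring.
        population.sort(key=lambda ind: accuracy(nodes, ind), reverse=True)
        elite = population[: pop_size // 2]
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return max(population, key=lambda ind: accuracy(nodes, ind))


best = evolve(TOY_NODES, ["cite", "coauthor"], n_periods=3)
print(best, accuracy(TOY_NODES, best))
```

Because each link type has its own weight vector and no functional form is imposed on it, the search is free to return weights that are neither monotone in time nor shared across link types, which is the property the abstract credits for the framework's advantage. In a real experiment the fitness would be measured on held-out data rather than on the training nodes.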

Author information

Corresponding author

Correspondence to İsmail Güneş.

Additional information

Responsible editor: Eamonn Keogh.

About this article

Cite this article

Güneş, İ., Çataltepe, Z. & Gündüz-Öğüdücü, Ş. GA-TVRC-Het: genetic algorithm enhanced time varying relational classifier for evolving heterogeneous networks. Data Min Knowl Disc 28, 670–701 (2014). https://doi.org/10.1007/s10618-013-0316-z

Keywords

  • Network data
  • Heterogeneous networks
  • Evolving networks
  • Social networks
  • Node classification
  • Relational Bayesian classifier
  • Genetic algorithms
  • Evolutionary strategies