Data Mining and Knowledge Discovery, Volume 28, Issue 3, pp 670–701

GA-TVRC-Het: genetic algorithm enhanced time varying relational classifier for evolving heterogeneous networks

  • İsmail Güneş
  • Zehra Çataltepe
  • Şule Gündüz-Öğüdücü

Abstract

Evolving heterogeneous networks, which contain different types of nodes and links that change over time, appear in many domains, including protein–protein interactions, scientific collaborations, and telecommunications. In this paper, we aim to discover temporal information from a heterogeneous evolving network in order to improve node classification. We propose a framework, Genetic Algorithm enhanced Time Varying Relational Classifier for evolving Heterogeneous Networks (GA-TVRC-Het), to extract the effects of different relationship types in different past time periods. These effects are discovered adaptively by means of genetic algorithms. A relational classifier is extended to work with different types of nodes and serves as the classification method. The proposed framework is tested on two real-world data sets. The results show that using the optimal time effect improves classification performance to a large extent. The optimal time effect does not necessarily follow a fixed functional trend, such as linear or exponential decay in time, and it may differ for each type of interaction. Both observations explain why GA-TVRC-Het outperforms methods that rely on a predefined form of time effect or on the same time effect for every link type.
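
The idea of learning a separate time effect for each link type can be illustrated with a minimal sketch. This is not the authors' implementation: the relational classifier is replaced here by a simple weighted neighbor vote, and the evolutionary operators (truncation selection, uniform crossover, Gaussian mutation), all function names, parameter defaults, and the toy data are assumptions made for illustration only.

```python
import random

def weighted_vote(links, weights):
    """Predict a label from (link_type, period, neighbor_label) triples.

    weights[link_type][period] is the learned time effect (hypothetical layout).
    """
    scores = {}
    for link_type, period, label in links:
        scores[label] = scores.get(label, 0.0) + weights[link_type][period]
    if not scores:
        return None  # node without links: no prediction
    # Iterate labels in sorted order so ties are broken deterministically.
    return max(sorted(scores), key=lambda c: scores[c])

def accuracy(nodes, weights):
    """Fraction of (links, true_label) pairs that the vote classifies correctly."""
    hits = sum(weighted_vote(links, weights) == label for links, label in nodes)
    return hits / len(nodes)

def evolve_weights(nodes, link_types, n_periods,
                   pop_size=20, generations=30, seed=0):
    """Search for per-link-type, per-period weights with a simple evolutionary loop."""
    rng = random.Random(seed)
    # Seed the population with uniform weights (no time effect) as a baseline.
    uniform = {r: [1.0] * n_periods for r in link_types}
    population = [uniform] + [
        {r: [rng.random() for _ in range(n_periods)] for r in link_types}
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=lambda w: accuracy(nodes, w), reverse=True)
        parents = population[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover per gene, then Gaussian mutation, clipped at zero.
            children.append({
                r: [max(0.0, (a[r][t] if rng.random() < 0.5 else b[r][t])
                        + rng.gauss(0.0, 0.1))
                    for t in range(n_periods)]
                for r in link_types
            })
        population = parents + children
    return max(population, key=lambda w: accuracy(nodes, w))

# Toy usage: two link types observed over three time periods (made-up data).
toy_nodes = [
    ([("cites", 0, "A"), ("cites", 2, "B"), ("coauthor", 2, "B")], "B"),
    ([("cites", 0, "A"), ("coauthor", 1, "A"), ("cites", 2, "B")], "A"),
    ([("coauthor", 0, "B"), ("cites", 2, "A")], "A"),
]
best = evolve_weights(toy_nodes, ["cites", "coauthor"], n_periods=3)
print(accuracy(toy_nodes, best), best)
```

Because the weights are free parameters for every (link type, period) pair, the search is not restricted to a linear or exponential decay shape, which is the property the abstract highlights. In practice the fitness would be measured on held-out nodes rather than on the nodes used for the search.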

Keywords

Network data, Heterogeneous networks, Evolving networks, Social networks, Node classification, Relational Bayesian classifier, Genetic algorithms, Evolutionary strategies

Copyright information

© The Author(s) 2013

Authors and Affiliations

  • İsmail Güneş (1)
  • Zehra Çataltepe (1)
  • Şule Gündüz-Öğüdücü (1)

  1. Computer Engineering Department, Istanbul Technical University, Istanbul, Turkey
