Link-Based Network Mining

  • Jerry Scripps
  • Ronald Nussbaum
  • Pang-Ning Tan
  • Abdol-Hossein Esfahanian


Network mining is a growing area of research within the data mining community that uses metrics and algorithms from graph theory. In this chapter, we present an overview of the different techniques in network mining and suggest directions for future research that draw on graph theory.


Network mining, Link mining, Data mining





Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Jerry Scripps (1)
  • Ronald Nussbaum
  • Pang-Ning Tan
  • Abdol-Hossein Esfahanian
  1. School of Computing and Information Systems, Grand Valley State University, Allendale, USA
