Link and Annotation Prediction Using Topology and Feature Structure in Large Scale Social Networks

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 424)


Repeated patterns observed in graph and network structures can be used for prediction in various domains, including cheminformatics, bioinformatics, political science, and sociology. In large-scale network structures such as social networks, graph-theoretical link and annotation prediction algorithms are usually not applicable because of the graph isomorphism problem, unless some form of approximation is applied. We propose a non-graph-theoretical alternative for link and annotation prediction in large networks that flattens network structures into feature vectors. We extract repeated sub-network pattern vectors for the nodes of a network and use traditional machine learning algorithms to estimate missing or unknown annotations and links in the network. Our main contributions are a novel method for extracting features from large-scale networks and an evaluation of the benefit each extraction method provides. We applied our methodology to suggesting new Twitter friends. In our experiments, we observed an 11-27% improvement in prediction accuracy compared to the simple approach of suggesting friends of friends.
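
The abstract does not spell out the feature extraction itself, so the following is only a minimal sketch of the two ideas it names: the friends-of-friends baseline, and flattening the local structure around a candidate pair into a feature vector that a conventional classifier could consume. The function names `friends_of_friends` and `pair_features`, the toy undirected graph, and the chosen features (degrees, common neighbors, Jaccard similarity) are assumptions made for illustration, not the authors' sub-network pattern vectors.

```python
from collections import Counter


def friends_of_friends(graph, user, k=3):
    """Baseline: rank non-friends of `user` by the number of shared friends."""
    friends = graph.get(user, set())
    counts = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone already connected to the user.
            if candidate != user and candidate not in friends:
                counts[candidate] += 1
    # Sort by shared-friend count (descending), then by name for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [candidate for candidate, _ in ranked[:k]]


def pair_features(graph, u, v):
    """Flatten the neighborhood of a candidate pair into a fixed-length vector.

    Hypothetical features: [degree(u), degree(v), common neighbors, Jaccard].
    """
    nu, nv = graph.get(u, set()), graph.get(v, set())
    common = len(nu & nv)
    union = len(nu | nv)
    jaccard = common / union if union else 0.0
    return [len(nu), len(nv), common, jaccard]


# Toy undirected graph as an adjacency mapping (illustrative data only).
graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d", "e"},
    "d": {"b", "c"},
    "e": {"c"},
}

suggestions = friends_of_friends(graph, "a")
vectors = {v: pair_features(graph, "a", v) for v in suggestions}
```

Vectors of this kind, labeled by whether the link actually exists, could then be passed to any standard classifier; richer pattern-based features would replace the four hypothetical ones used here.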


social networks; data mining and knowledge discovery; big data; business intelligence; link prediction; graph processing; graph mining





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Computer Engineering, Gebze Institute of Technology, Gebze, Turkey
  2. Turkcell, Inc., Istanbul, Turkey
