Abstract
In this paper, we consider the link prediction problem: given a partial snapshot of a network at some time, the goal is to predict the additional links formed at a later time. The accuracy of current prediction methods is quite low because of extreme class skew and the large number of potential links. Here, we describe learning algorithms based on chance-constrained programs and show that they exhibit all the properties needed for a good link predictor: they allow a preferential bias toward the positive or negative class, handle skewness in the data, and scale to large networks. Our experimental results on three real-world domains (co-authorship networks, biological networks, and citation networks) show significant performance improvement over baseline algorithms. We conclude by briefly describing some promising future directions based on this work.
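The problem setup and the class skew it produces can be illustrated with a minimal sketch. This is not the paper's algorithm; the toy graph, the function names `candidate_pairs` and `label_candidates`, and the edge lists are all hypothetical and chosen only to show how candidate links are enumerated from a snapshot and how few of them turn out to be positive.

```python
from itertools import combinations


def candidate_pairs(nodes, edges):
    """Return all unordered node pairs that are NOT linked in the snapshot."""
    existing = {frozenset(e) for e in edges}
    return [pair for pair in combinations(sorted(nodes), 2)
            if frozenset(pair) not in existing]


def label_candidates(candidates, later_edges):
    """Label a candidate pair 1 if the link appears later, else 0."""
    later = {frozenset(e) for e in later_edges}
    return [(pair, 1 if frozenset(pair) in later else 0) for pair in candidates]


# Hypothetical toy data: a 6-node path observed at the snapshot time,
# plus two links that form afterwards.
nodes = range(6)
snapshot = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
later = [(0, 2), (3, 5)]

candidates = candidate_pairs(nodes, snapshot)
labeled = label_candidates(candidates, later)
positives = sum(label for _, label in labeled)
skew = positives / len(labeled)

print(f"{len(candidates)} candidate links, {positives} positive, "
      f"positive fraction = {skew:.2f}")
```

The number of candidate pairs grows quadratically with the number of nodes while the number of new links typically grows far more slowly, so on real networks the positive fraction is much smaller than in this toy case.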
Keywords
- Citation Network
- Link Prediction
- Prediction Task
- Chance Constraint
- Defense Advanced Research Projects Agency
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Doppa, J.R., Yu, J., Tadepalli, P., Getoor, L. (2010). Learning Algorithms for Link Prediction Based on Chance Constraints. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol 6321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15880-3_28
DOI: https://doi.org/10.1007/978-3-642-15880-3_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15879-7
Online ISBN: 978-3-642-15880-3
eBook Packages: Computer Science (R0)