Data Mining and Knowledge Discovery, Volume 31, Issue 5, pp 1331–1358

Lagrangian relaxations for multiple network alignment

Article
Part of the following topical collections:
  1. Journal Track of ECML PKDD 2017

Abstract

We propose a principled approach to the problem of aligning multiple partially overlapping networks. The objective is to map multiple graphs into a single graph while preserving vertex and edge similarities. The problem is inspired by the task of integrating partial views of a family tree (genealogical network) into one unified network, but it also has applications in, for example, social and biological networks. Our approach, called Flan, generalizes the facility location problem by adding a non-linear term to capture edge similarities and to infer the underlying entity network. The problem is solved using an alternating optimization procedure with a Lagrangian relaxation. Flan can leverage prior information on the number of entities; when this information is available, Flan is shown to work robustly without using any ground truth data to fine-tune method parameters. Additionally, we present three multiple-network extensions to an existing state-of-the-art pairwise alignment method called Natalie. Extensive experiments on synthetic datasets, as well as on real-world social network and genealogical network datasets, attest to the effectiveness of the proposed approaches, which clearly outperform a popular multiple network alignment method called IsoRankN.
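
For readers unfamiliar with the building blocks named above, the following is a minimal sketch of a Lagrangian relaxation for the textbook uncapacitated facility location problem. It is not Flan's objective: the non-linear edge-similarity term and the alternating optimization are omitted, and every name here (`ufl_cost`, `lagrangian_bound`, `subgradient_ascent`, the toy instance) is a hypothetical choice made for illustration. The sketch dualizes the constraint that each client is assigned exactly once, which makes the relaxed problem decompose per facility and yields a lower bound on the optimal cost.

```python
from itertools import product

# Toy instance (assumed for illustration): 2 candidate facilities, 3 clients.
OPEN_COST = [3.0, 2.0]              # cost of opening facility i
ASSIGN_COST = [[1.0, 4.0, 2.0],     # ASSIGN_COST[i][j]: cost of serving client j from i
               [3.0, 1.0, 2.0]]


def ufl_cost(open_cost, assign_cost, open_set):
    """Total cost when `open_set` is opened and each client uses its cheapest open facility."""
    n_clients = len(assign_cost[0])
    opening = sum(open_cost[i] for i in open_set)
    serving = sum(min(assign_cost[i][j] for i in open_set) for j in range(n_clients))
    return opening + serving


def brute_force_optimum(open_cost, assign_cost):
    """Exact optimum by enumerating every non-empty set of open facilities (tiny instances only)."""
    n = len(open_cost)
    best = float("inf")
    for mask in product([0, 1], repeat=n):
        open_set = [i for i in range(n) if mask[i]]
        if open_set:
            best = min(best, ufl_cost(open_cost, assign_cost, open_set))
    return best


def lagrangian_bound(open_cost, assign_cost, lam):
    """Lower bound obtained by dualizing 'each client is assigned exactly once'.

    With multipliers lam[j], the relaxed problem separates per facility:
    facility i is opened only if its opening cost plus its negative
    reduced assignment costs (c_ij - lam_j) sums to less than zero.
    """
    total = sum(lam)
    for i, f in enumerate(open_cost):
        reduced = f + sum(min(0.0, c - l) for c, l in zip(assign_cost[i], lam))
        total += min(0.0, reduced)
    return total


def subgradient_ascent(open_cost, assign_cost, steps=50, step_size=1.0):
    """Improve the bound with a plain subgradient method and a diminishing step size."""
    n_clients = len(assign_cost[0])
    lam = [0.0] * n_clients
    best = lagrangian_bound(open_cost, assign_cost, lam)
    for t in range(1, steps + 1):
        # Solve the relaxed subproblem and count how often each client is served.
        served = [0] * n_clients
        for i, f in enumerate(open_cost):
            gains = [min(0.0, c - l) for c, l in zip(assign_cost[i], lam)]
            if f + sum(gains) < 0:
                for j, g in enumerate(gains):
                    if g < 0:
                        served[j] += 1
        # Subgradient of the dual at lam is (1 - served[j]) for each client j.
        lam = [l + (step_size / t) * (1 - s) for l, s in zip(lam, served)]
        best = max(best, lagrangian_bound(open_cost, assign_cost, lam))
    return best
```

By weak duality, the value returned by `subgradient_ascent(OPEN_COST, ASSIGN_COST)` can never exceed `brute_force_optimum(OPEN_COST, ASSIGN_COST)`; how tight the bound gets depends on the step-size schedule, which is an arbitrary choice here.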

Keywords

Multiple network alignment, Facility location, Lagrangian relaxation, Genealogical trees, Social networks

Copyright information

© The Author(s) 2017

Authors and Affiliations

  1. HIIT, Aalto University, Espoo, Finland
  2. Qatar Computing Research Institute, HBKU, Doha, Qatar
