Towards Heterogeneous Network Alignment: Design and Implementation of a Large-Scale Data Processing Framework

  • Marianna Milano
  • Pierangelo Veltri
  • Mario Cannataro
  • Pietro H. Guzzi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11339)


Networks are widely recognised as an important tool for modelling and analysing biological data and the interplay of biomolecules. Consequently, many algorithms for analysing and comparing networks, such as alignment algorithms, have been developed. More recently, several approaches have tried to integrate the interplay of different kinds of molecules, such as genes, transcription factors and microRNAs, into a single model. A possible formalism for this scenario is the node-coloured network (or heterogeneous network), implemented as a node- or edge-coloured graph. This creates the need for alignment algorithms able to analyse heterogeneous networks. To the best of our knowledge, no existing algorithm is able to mine heterogeneous networks. We propose a two-step alignment strategy that takes as input two heterogeneous networks (node-coloured graphs) and a similarity function among the nodes of the two networks, extending previous formulations. We first build a single alignment graph; we then mine this graph to extract relevant subgraphs. Although the approach is simple, the analysis of such networks relies on graph and subgraph isomorphism, and the size of the data is still growing. Therefore, a high-performance data analytics framework is needed. We present HetNetAligner, a framework built on top of Apache Spark. We implemented our algorithm and tested it on selected heterogeneous biological networks. Preliminary results indicate that our method can extract relevant knowledge from biological data while reducing the computational time.
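
The two-step strategy described in the abstract can be illustrated with a minimal single-machine sketch in plain Python. It is not the HetNetAligner implementation, which is built on Apache Spark. Several parts are assumptions made only for illustration: the function names `build_alignment_graph` and `connected_components`, the `threshold` parameter, the rule that two alignment nodes are linked only when the corresponding edge exists in both input networks, and the use of connected components as a stand-in for the mining step. The toy networks and similarity scores are invented.

```python
from itertools import combinations


def build_alignment_graph(g1, g2, colour1, colour2, similarity, threshold=0.5):
    """Step 1 (assumed form): pair nodes of equal colour whose similarity
    reaches the threshold, then link two pairs when both input networks
    contain the corresponding edge.

    g1, g2: dict mapping each node to the set of its neighbours.
    colour1, colour2: dict mapping each node to its colour (node type).
    similarity: function (u, v) -> float.
    """
    pairs = [
        (u, v)
        for u in g1
        for v in g2
        if colour1[u] == colour2[v] and similarity(u, v) >= threshold
    ]
    adjacency = {pair: set() for pair in pairs}
    for (u1, v1), (u2, v2) in combinations(pairs, 2):
        # Keep the edge only if it is conserved in both networks.
        if u1 != u2 and v1 != v2 and u2 in g1[u1] and v2 in g2[v1]:
            adjacency[(u1, v1)].add((u2, v2))
            adjacency[(u2, v2)].add((u1, v1))
    return adjacency


def connected_components(adjacency):
    """Step 2 (stand-in): extract subgraphs as connected components."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Invented toy networks with three node colours.
net1 = {"g1": {"t1", "m1"}, "t1": {"g1"}, "m1": {"g1"}}
col1 = {"g1": "gene", "t1": "tf", "m1": "mirna"}
net2 = {"gA": {"tA"}, "tA": {"gA", "mA"}, "mA": {"tA"}}
col2 = {"gA": "gene", "tA": "tf", "mA": "mirna"}
scores = {("g1", "gA"): 0.9, ("t1", "tA"): 0.8, ("m1", "mA"): 0.7}

alignment = build_alignment_graph(
    net1, net2, col1, col2, lambda u, v: scores.get((u, v), 0.0)
)
modules = connected_components(alignment)
```

In this toy case the gene and transcription-factor pairs end up in one module because their edge is conserved in both networks, while the microRNA pair stays isolated. The pairwise loop is quadratic in the number of candidate pairs, which is one reason a distributed framework becomes attractive as the networks grow.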


Heterogeneous network, Network alignment, Apache Spark



This work has been partially supported by Fondo di Finanziamento per le Attività Base di Ricerca (FFABR 2017) of Prof. Pietro Hiram Guzzi.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Medical and Surgical Science, University Magna Græcia, Catanzaro, Italy
  2. Data Analytics Research Centre, University of Catanzaro, Catanzaro, Italy
