paraSNF: An Parallel Approach for Large-Scale Similarity Network Fusion

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 908)

Abstract

With the rapid accumulation of multi-dimensional disease data, integrating multiple similarity networks is essential for understanding how diseases develop and for identifying disease subtypes. Similarity network fusion (SNF), a recent computationally efficient method, is well suited to integrating similarity networks and has been applied extensively in bioinformatics analysis. However, both the computational complexity and the space complexity of SNF grow with the number of samples. In this research, we develop a parallel SNF algorithm named paraSNF to improve the speed and scalability of SNF. Experimental results on two large-scale simulation datasets show that paraSNF is 30x–100x faster than serial SNF, and 8x–15x faster than SNF running on multiple cores with multiple threads. Furthermore, paraSNF saves more than 60% of the memory space, which greatly improves the scalability of SNF.
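
To illustrate why the cost of serial SNF grows with the number of samples, the following is a minimal sketch of the cross-diffusion update from the original SNF method, written with NumPy. It is a simplified illustration and not the paraSNF implementation, which the abstract does not detail. The function names (`row_normalize`, `knn_kernel`, `fuse`), the plain row normalization, and the default values of `k` and `iterations` are assumptions made for this example.

```python
import numpy as np


def row_normalize(W):
    """Scale each row of W so that it sums to 1."""
    return W / W.sum(axis=1, keepdims=True)


def knn_kernel(W, k):
    """Keep the k largest similarities in each row, zero the rest, renormalize."""
    n = W.shape[0]
    top = np.argsort(-W, axis=1)[:, :k]   # column indices of the k largest per row
    rows = np.arange(n)[:, None]
    S = np.zeros_like(W)
    S[rows, top] = W[rows, top]
    return row_normalize(S)


def fuse(networks, k=3, iterations=10):
    """Fuse two or more n x n similarity matrices (positive entries assumed)."""
    networks = [np.asarray(W, dtype=float) for W in networks]
    m = len(networks)
    P = [row_normalize(W) for W in networks]   # global status matrices
    S = [knn_kernel(W, k) for W in networks]   # sparse local kernels
    for _ in range(iterations):
        updated = []
        for v in range(m):
            # Average the status matrices of all other data types.
            others = sum(P[u] for u in range(m) if u != v) / (m - 1)
            # Two dense n x n matrix products per data type per iteration.
            updated.append(row_normalize(S[v] @ others @ S[v].T))
        P = updated
    fused = sum(P) / m
    return (fused + fused.T) / 2               # symmetrize the result
```

In this sketch every iteration performs two dense n x n matrix products per data type, so the work grows roughly cubically with the sample count n, while each stored matrix needs n² entries (about 800 MB in double precision at n = 10,000). This is consistent with the scaling problem the abstract describes.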

This work was mainly supported by the National Natural Science Foundation of China (Grant No. U1435219), the National Key Research and Development Program of China (No. 2016YFB0200401), the Major Research Plan of the National Natural Science Foundation of China (No. U1435222), the National Natural Science Foundation of China (No. 61572515), and the Major Research Plan of the National Key R&D Program of China (No. 2016YFC0901600).

X. Shen and S. He—These authors contributed equally to this work.

Author information

Corresponding authors

Correspondence to Xiaochen Bo or Yong Dou.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Shen, X., He, S., Fang, M., Wen, Y., Bo, X., Dou, Y. (2018). paraSNF: An Parallel Approach for Large-Scale Similarity Network Fusion. In: Li, C., Wu, J. (eds) Advanced Computer Architecture. ACA 2018. Communications in Computer and Information Science, vol 908. Springer, Singapore. https://doi.org/10.1007/978-981-13-2423-9_12

  • DOI: https://doi.org/10.1007/978-981-13-2423-9_12

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-2422-2

  • Online ISBN: 978-981-13-2423-9

  • eBook Packages: Computer Science, Computer Science (R0)