1,000x Faster Than PLINK: Genome-Wide Epistasis Detection with Logistic Regression Using Combined FPGA and GPU Accelerators

  • Lars Wienbrandt
  • Jan Christian Kässens
  • Matthias Hübenthal
  • David Ellinghaus
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10861)

Abstract

Logistic regression as implemented in PLINK is a powerful and commonly used framework for assessing gene-gene (GxG) interactions. However, fitting a regression model for each pair of markers in a genome-wide dataset is computationally intensive. Performing billions of tests with PLINK takes days, if not weeks, which is why pre-filtering techniques and fast epistasis screenings are applied to reduce the computational burden.
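To illustrate what a single pairwise test involves, the following is a minimal sketch, not the PLINK or accelerator implementation. It assumes additive 0/1/2 genotype coding, the model logit(P(case)) = b0 + b1*A + b2*B + b3*A*B, and a Wald test on the interaction coefficient b3. The function name `gxg_logistic_test` and the simulated data are hypothetical and serve only as an example.

```python
import math
import numpy as np

def gxg_logistic_test(g_a, g_b, y, max_iter=25, tol=1e-8):
    """Fit logit(P(y=1)) = b0 + b1*A + b2*B + b3*A*B by Newton-Raphson
    and return (beta, chi2, p_value) for a Wald test on b3.

    Assumes genotypes coded 0/1/2 and a binary phenotype coded 0/1.
    """
    g_a = np.asarray(g_a, dtype=float)
    g_b = np.asarray(g_b, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept, both main effects, interaction term.
    X = np.column_stack([np.ones_like(g_a), g_a, g_b, g_a * g_b])
    beta = np.zeros(4)
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)                  # score vector
        hess = X.T @ (X * w[:, None])         # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    # Covariance of the estimates from the information at the final beta.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    chi2 = beta[3] ** 2 / cov[3, 3]           # Wald statistic, 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square(1) upper tail
    return beta, chi2, p_value

# Illustrative use on simulated data with a true interaction effect.
rng = np.random.default_rng(0)
n = 4000
a = rng.binomial(2, 0.3, size=n)
b = rng.binomial(2, 0.3, size=n)
prob = 1.0 / (1.0 + np.exp(-(-0.5 + 0.8 * a * b)))
pheno = rng.binomial(1, prob)
beta, chi2, p_value = gxg_logistic_test(a, b, pheno)
print(beta, chi2, p_value)
```

Every marker pair requires its own iterative fit of this kind, which is why the cost grows quadratically with the number of markers.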

Here, we demonstrate that combining a Xilinx UltraScale KU115 FPGA with an Nvidia Tesla P100 GPU reduces runtimes for logistic regression GxG tests on a genome-wide scale to only minutes. In particular, a dataset of 53,000 samples genotyped at 130,000 SNPs was analyzed in 8 min, a speedup of more than 1,000-fold compared to PLINK v1.9 using 32 threads on a server-grade computing platform. Furthermore, on-the-fly calculation of test statistics, p-values and LD-scores in double precision makes commonly used pre-filtering strategies obsolete.

Keywords

Genome-wide association study (GWAS); Genome-wide interaction study (GWIS); Gene-gene (GxG) interaction; Linkage disequilibrium (LD); BOOST; Hardware accelerator; Hybrid computing; Heterogeneous architecture

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Lars Wienbrandt (1)
  • Jan Christian Kässens (1)
  • Matthias Hübenthal (1)
  • David Ellinghaus (1)
  1. Institute of Clinical Molecular Biology, University Medical Center Schleswig-Holstein, Campus Kiel, Kiel University, Kiel, Germany
