Soft Computing, Volume 12, Issue 12, pp 1169–1183

GP on SPMD parallel graphics hardware for mega Bioinformatics data mining

Focus

Abstract

We demonstrate a SIMD C++ genetic programming (GP) system running on a single 128-node parallel nVidia GeForce 8800 GTX GPU under RapidMind’s GPGPU Linux software. The system predicts the ten-year-plus outcome of breast cancer from a dataset containing a million inputs. NCBI GEO GSE3494 contains hundreds of Affymetrix HG-U133A and HG-U133B GeneChip biopsies. Multiple GP runs, each with a population of 5 million programs, winnow useful variables from the chaff at more than 500 million GPops per second. Sources are available via FTP.



Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  1. Mathematical and Biological Sciences, University of Essex, Colchester, UK
