Large Scale Bioinformatics Data Mining with Parallel Genetic Programming on Graphics Processing Units

  • Chapter

Part of the book series: Studies in Computational Intelligence ((SCI,volume 269))

Abstract

A suitable single instruction multiple data (SIMD) genetic programming (GP) interpreter can achieve high (giga GPop per second) performance on a SIMD graphics card (GPU) by simultaneously running many diverse members of the GP population. Single program multiple data (SPMD) dataflow parallelisation is achieved because the single interpreter treats the different GP programs as data. On a single nVidia GeForce 8800 GTX GPU with 128 parallel nodes, the interpreter can outrun a compiled approach, in which data parallelism comes only from running a single program at a time across multiple inputs.

The RapidMind GPGPU Linux C++ system has been demonstrated by predicting the long-term (ten years or more) outcome of breast cancer from a dataset containing a million inputs. The dataset, NCBI GEO GSE3494, contains hundreds of Affymetrix HG-U133A and HG-U133B GeneChip biopsies. Multiple GP runs, each with a population of five million programs, winnow useful variables from the chaff at more than 500 million GPops per second. Sources are available via FTP.
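The idea that one interpreter can treat different GP programs as data may be easier to see in a small sketch. The Python below is only an illustration under stated assumptions, not the chapter's RapidMind C++ implementation: the reverse-Polish encoding, the opcode names (`PUSH_X`, `PUSH_ONE`, `ADD`, `SUB`, `MUL`, `NOP`) and the function `simd_interpret` are all hypothetical. Every program advances under one shared instruction pointer, and the inner loop over programs stands in for the parallel lanes of a GPU.

```python
"""Lockstep (SIMD-style) interpreter sketch: programs are data, one loop runs them all."""

# Hypothetical opcodes for a tiny function set; the chapter's real set is not given here.
PUSH_X, PUSH_ONE, ADD, SUB, MUL, NOP = range(6)


def simd_interpret(programs, x):
    """Evaluate every reverse-Polish program on the input x.

    All programs share one instruction pointer (`step`). On a GPU each
    program would occupy its own parallel lane; here the inner loop over
    programs stands in for that hardware parallelism.
    """
    length = max(len(p) for p in programs)
    # Pad shorter programs with NOP so every lane runs the same number of steps.
    padded = [list(p) + [NOP] * (length - len(p)) for p in programs]
    stacks = [[] for _ in programs]

    for step in range(length):
        for prog, stack in zip(padded, stacks):
            op = prog[step]  # the opcode is just data fetched by the lane
            if op == PUSH_X:
                stack.append(x)
            elif op == PUSH_ONE:
                stack.append(1.0)
            elif op in (ADD, SUB, MUL):
                right = stack.pop()
                left = stack.pop()
                if op == ADD:
                    stack.append(left + right)
                elif op == SUB:
                    stack.append(left - right)
                else:
                    stack.append(left * right)
            # NOP: the lane idles for this step.

    return [stack[-1] for stack in stacks]


population = [
    [PUSH_X, PUSH_ONE, ADD],               # x + 1
    [PUSH_X, PUSH_X, MUL, PUSH_ONE, SUB],  # x * x - 1
    [PUSH_X],                              # x
]
results = simd_interpret(population, 3.0)
```

Because the programs are plain opcode lists, nothing has to be compiled when the population changes. In this sketch the cost is that every lane runs for as many steps as the longest program.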

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Langdon, W.B. (2010). Large Scale Bioinformatics Data Mining with Parallel Genetic Programming on Graphics Processing Units. In: de Vega, F.F., Cantú-Paz, E. (eds) Parallel and Distributed Computational Intelligence. Studies in Computational Intelligence, vol 269. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10675-0_6

  • DOI: https://doi.org/10.1007/978-3-642-10675-0_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10674-3

  • Online ISBN: 978-3-642-10675-0

  • eBook Packages: Engineering, Engineering (R0)
