Visualizing Next-Generation Sequencing Cancer Data Sets with Cloud Computing
With the advent of next-generation sequencing technology, clinical data sets now contain enormous amounts of valuable genomic information related to a wide range of diseases such as cancer. These data need to be analysed, managed, stored, visualized and integrated in order to be clinically useful. However, many of the clinicians and researchers who need to interpret these data sets are non-specialists in the information technology domain and so need systems that are effective and easy to use. Herein, we present an overview of a novel cloud-computing-based next-generation sequencing research management software system which has simplicity, scalability, speed and reproducibility at its core. A prototype that enables rapid visualization of big cancer data sets is described. We present preliminary results from a bioinformatics pipeline for the Sage Care project, a European Union funded cancer research project, for comprehensive genome mapping analysis and visualization, and outline the benefits of integrating this pipeline into a graphical user interface platform such as Simplicity.
Paul Walsh, Brian Kelly, Timm Heuss and Brendan Lawlor are investigators on Sage Care, an H2020 MSCA-funded project, grant number 644186.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.