HASV: Hadoop-Based NGS Analyzer for Predicting Genomic Structure Variations
Next-generation sequencing (NGS) technology produces large-scale biological data sets far more cheaply and quickly than earlier methods. Storing and analyzing data at this scale with traditional methods on a single commodity server is nearly impossible, which creates many practical problems. Hadoop offers an alternative that addresses this requirement. We aim to address the issues involved in large-scale data analysis on the cloud in bioinformatics. Accordingly, we propose an analysis service that uses Hadoop to predict genomic structural variations associated with diseases. The results of this study show that the proposed system efficiently predicts genomic variations from large-scale data sets.
Keywords: Structural Variation, Fragment Size, Insert Size, Large Scale Data Analysis, Commodity Server