Personal Genomes: A New Frontier in Database Research
- Cite this paper as:
- Saito T.L. (2011) Personal Genomes: A New Frontier in Database Research. In: Kikuchi S., Madaan A., Sachdeva S., Bhalla S. (eds) Databases in Networked Information Systems. DNIS 2011. Lecture Notes in Computer Science, vol 7108. Springer, Berlin, Heidelberg
Owing to recent technological improvements in next-generation sequencers, reading the genome sequence of an individual's DNA has become common in biological and medical studies. The amount of data produced by next-generation sequencers is enormous. Today, the DNA of more than 10,000 people has been sequenced worldwide, and terabytes of data are being produced on a daily basis. The types of genome information also vary according to the biological experiments used for preparing DNA samples. Biologists and medical scientists now face the task of managing these huge volumes of data of various types. Existing DBMSs, whose major targets are business applications, are not suited to managing these biological data, because loading such large data into a DBMS is time-consuming and current database queries cannot accommodate the various bioinformatics tools written in various programming languages. Processing bioinformatics workflows in a parallel and distributed manner is also a challenging problem. In this paper, in the hope of recruiting database researchers into this rapidly progressing area of biological and medical research, we introduce several challenges in genome informatics from the viewpoint of using existing DBMSs for processing next-generation sequencer data.
Keywords: Personal genomes, bioinformatics, parallel computing, workflow management