Indexing Finite Language Representation of Population Genotypes

  • Jouni Sirén
  • Niko Välimäki
  • Veli Mäkinen
Conference paper

DOI: 10.1007/978-3-642-23038-7_23

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6833)
Cite this paper as:
Sirén J., Välimäki N., Mäkinen V. (2011) Indexing Finite Language Representation of Population Genotypes. In: Przytycka T.M., Sagot MF. (eds) Algorithms in Bioinformatics. WABI 2011. Lecture Notes in Computer Science, vol 6833. Springer, Berlin, Heidelberg

Abstract

We propose a way to index population genotype information together with the complete genome sequence, so that a given sequence can be aligned efficiently to the genome with all plausible genotype recombinations taken into account. This is achieved by converting a multiple alignment of individual genomes into a finite automaton that recognizes all strings that can be read from the alignment by switching the sequence at any time. The finite automaton is indexed with an extension of the Burrows-Wheeler transform to allow pattern search inside the plausible recombinant sequences. The size of the index stays limited because of the high similarity of individual genomes. The index finds applications in variation calling and in primer design.
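
The conversion from a multiple alignment to an automaton can be illustrated with a small sketch. It is not the authors' construction, and it does not use the Burrows-Wheeler-based index the abstract describes: it builds a plain nondeterministic automaton and searches it by direct simulation. The names `build_automaton` and `occurs`, the `-` gap symbol, and the rule that a switch may happen between any two alignment columns are all assumptions made for this example.

```python
from collections import defaultdict

GAP = "-"  # assumed gap symbol; the abstract does not specify one


def build_automaton(alignment):
    """Build a toy automaton from equal-length alignment rows.

    Nodes are (column, character) pairs, one per distinct non-gap
    character in each column. From any node in column i there is an edge
    to the next non-gap character of every row to the right of i, which
    models switching to that row immediately after column i.
    """
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all alignment rows must have the same length")

    nodes = {
        (col, row[col])
        for row in alignment
        for col in range(width)
        if row[col] != GAP
    }
    by_col = defaultdict(set)
    for node in nodes:
        by_col[node[0]].add(node)

    edges = defaultdict(set)
    for row in alignment:
        nxt = None  # nearest non-gap node of this row strictly to the right
        for col in range(width - 1, -1, -1):
            if nxt is not None:
                for node in by_col[col]:
                    edges[node].add(nxt)
            if row[col] != GAP:
                nxt = (col, row[col])
    return nodes, edges


def occurs(pattern, nodes, edges):
    """Return True if pattern labels some path in the automaton."""
    if not pattern:
        return True
    # Start from every node whose label matches the first character.
    active = {node for node in nodes if node[1] == pattern[0]}
    for ch in pattern[1:]:
        active = {
            succ
            for node in active
            for succ in edges.get(node, ())
            if succ[1] == ch
        }
        if not active:
            return False
    return True


if __name__ == "__main__":
    nodes, edges = build_automaton(["ACGT", "AGGA"])
    print(occurs("AGGT", nodes, edges))  # recombinant: second row, then first
    print(occurs("ACC", nodes, edges))   # not readable from the alignment
```

In this toy alignment, `AGGT` occurs in neither row but is accepted, because it follows the second row for three columns and then switches to the first. The simulation visits nodes explicitly, so it only illustrates which strings the automaton recognizes; it says nothing about the space or time behaviour of the indexed version.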

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jouni Sirén (1)
  • Niko Välimäki (1)
  • Veli Mäkinen (1)
  1. Helsinki Institute for Information Technology (HIIT) & Department of Computer Science, University of Helsinki, Finland
