Genomic Signatures in De Bruijn Chains

  • Lenwood S. Heath
  • Amrita Pati
Conference paper

DOI: 10.1007/978-3-540-74126-8_21

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4645)
Cite this paper as:
Heath L.S., Pati A. (2007) Genomic Signatures in De Bruijn Chains. In: Giancarlo R., Hannenhalli S. (eds) Algorithms in Bioinformatics. WABI 2007. Lecture Notes in Computer Science, vol 4645. Springer, Berlin, Heidelberg

Abstract

Genomes have both deterministic and random aspects, with the underlying DNA sequences exhibiting features at numerous scales, from codons to regions of conserved or divergent gene order. This work examines the unique manner in which oligonucleotides fit together to comprise a genome, within a graph-theoretic setting. A de Bruijn chain (DBC) is a generalization of a finite Markov chain. A DNA word graph (DWG) is a generalization of a de Bruijn graph that records the occurrence counts of nodes and edges in a genomic sequence generated by a DBC. We combine the properties of DWGs and DBCs to obtain a powerful genomic signature that is demonstrated to be information-rich, efficient, and sufficiently representative of the sequence from which it is derived. We illustrate its practical value in distinguishing genomic sequences and in predicting the origin of short DNA sequences of unknown origin, while highlighting its superior performance compared to existing genomic signatures, including the dinucleotide odds ratio.
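
As a rough illustration of the counting idea behind a DNA word graph, the sketch below tallies how often each word of length w (a node) and each pair of consecutive overlapping words (an edge) occurs in a sequence. The abstract does not give the paper's formal definitions, so the function name `dna_word_graph`, the parameter `w`, and the representation of an edge as a pair of words are assumptions made for illustration only; the sketch does not model the de Bruijn chain itself or the resulting signature.

```python
from collections import Counter


def dna_word_graph(sequence, w):
    """Count w-length words (nodes) and consecutive overlapping word pairs (edges).

    Hypothetical helper: an edge (u, v) is recorded whenever word v starts
    one position after word u, as in a de Bruijn graph of order w.
    """
    seq = sequence.upper()
    # Every length-w window in the sequence is one occurrence of a node.
    nodes = Counter(seq[i:i + w] for i in range(len(seq) - w + 1))
    # Every pair of adjacent windows is one occurrence of an edge.
    edges = Counter(
        (seq[i:i + w], seq[i + 1:i + w + 1]) for i in range(len(seq) - w)
    )
    return nodes, edges


# Toy sequence, chosen only to show the counts.
nodes, edges = dna_word_graph("ACGTACGA", 2)
print(nodes["AC"])          # occurrences of the word AC
print(edges[("AC", "CG")])  # occurrences of AC followed by CG
```

Normalizing such counts would give empirical word and transition frequencies; how the paper turns them into a signature is not stated in the abstract.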

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Lenwood S. Heath (1)
  • Amrita Pati (1)
  1. Department of Computer Science, Virginia Tech, Blacksburg, VA 24061-0106
