International Workshop on Algorithms in Bioinformatics

WABI 2015: Algorithms in Bioinformatics pp 286-295

Higher Classification Accuracy of Short Metagenomic Reads by Discriminative Spaced k-mers

Conference paper

DOI: 10.1007/978-3-662-48221-6_21

Part of the Lecture Notes in Computer Science book series (LNCS, volume 9289)
Cite this paper as:
Ounit R., Lonardi S. (2015) Higher Classification Accuracy of Short Metagenomic Reads by Discriminative Spaced k-mers. In: Pop M., Touzet H. (eds) Algorithms in Bioinformatics. WABI 2015. Lecture Notes in Computer Science, vol 9289. Springer, Berlin, Heidelberg


The growing number of metagenomic studies in medicine and environmental sciences is creating new computational demands in the analysis of these very large datasets. We have recently proposed a time-efficient algorithm called Clark that can accurately classify metagenomic sequences against a set of reference genomes. The competitive advantage of Clark depends on the use of discriminative contiguous k-mers. In default mode, Clark’s speed is currently unmatched and its precision is comparable to the state of the art; however, its sensitivity still does not match the level of the most sensitive (but slowest) metagenomic classifier. In this paper, we introduce an algorithmic improvement that allows Clark’s classification sensitivity to match the best metagenomic classifier, without a significant loss of speed or precision compared to the original version. Finally, on real metagenomes, Clark can assign with high accuracy a much higher proportion of short reads than its closest competitor. The improved version of Clark, based on discriminative spaced k-mers, is freely available at
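As a rough illustration of the idea named in the abstract, the sketch below shows what a discriminative spaced k-mer index might look like on toy data. It is not the authors' implementation: the function names, the mask `1101`, the toy sequences, and the simple majority-vote rule are all assumptions made for this example, and the abstract itself does not specify the k-mer length or the spacing patterns used. A spaced k-mer keeps only the positions marked `1` in a mask, so a mismatch at a `0` position does not destroy the match; a spaced k-mer is treated here as discriminative when it occurs in exactly one reference.

```python
from collections import defaultdict


def spaced_kmers(seq, mask):
    """Yield the spaced k-mers of seq under mask ('1' = keep, '0' = ignore)."""
    keep = [i for i, c in enumerate(mask) if c == "1"]
    for start in range(len(seq) - len(mask) + 1):
        window = seq[start:start + len(mask)]
        yield "".join(window[i] for i in keep)


def discriminative_index(references, mask):
    """Map each spaced k-mer seen in exactly one reference to that reference's label."""
    owners = defaultdict(set)
    for label, genome in references.items():
        for kmer in spaced_kmers(genome, mask):
            owners[kmer].add(label)
    # Spaced k-mers shared by two or more references are discarded.
    return {k: next(iter(labels)) for k, labels in owners.items() if len(labels) == 1}


def classify(read, index, mask):
    """Return the label with the most discriminative hits, or None if there are none."""
    hits = defaultdict(int)
    for kmer in spaced_kmers(read, mask):
        if kmer in index:
            hits[index[kmer]] += 1
    return max(hits, key=hits.get) if hits else None


# Toy data (hypothetical): two tiny "genomes" and a read that differs from
# genome A by one substitution at its third base.
MASK = "1101"
REFERENCES = {"A": "ACGTACGTTT", "B": "TTGCATGCAA"}
INDEX = discriminative_index(REFERENCES, MASK)
label = classify("ACCTACGT", INDEX, MASK)
```

In this toy setting the first window of the read still matches genome A because the substituted base falls on the ignored position of the mask, which is the intuition behind the sensitivity gain described above.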


Metagenomics; Microbiome; Classification; Discriminative spaced k-mers; Short metagenomic reads

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Department of Computer Science and Engineering, University of California, Riverside, USA
