DNA Sequence Search Using Content-Based Image Search Approach

  • Heri Ramampiaro
  • Aleksander Grande
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 93)


In this work, we investigate a new method for searching DNA sequences based on a multimedia retrieval approach. We address the issues of index size and performance by first transforming the DNA sequences into images and then indexing these images using content-based image indexing techniques. The main goal is to allow users to retrieve similar gene sequences using stored image features rather than the sequences themselves. We suggest two algorithms for the conversion, each of which has been tested to reveal its sensitivity to both sequence length and sequence changes. We have also compared our approach to BLAST, which was used as a reference system. The results of our experiments show that this approach performs well with respect to index size and search speed, but more work is needed to improve its search sensitivity.
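The abstract does not describe the two conversion algorithms, so the following is only an illustrative sketch of the general idea: map a sequence to pixels, reduce the image to a small feature vector, and search on the stored features instead of the sequence. The grayscale mapping, the row width, the histogram feature, and the names `sequence_to_image`, `intensity_histogram`, `build_index`, and `search` are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from math import sqrt

# Hypothetical mapping of nucleotides to grayscale intensities (an assumption,
# not the paper's conversion). Symbols outside A/C/G/T raise KeyError.
BASE_TO_GRAY = {"A": 50, "C": 100, "G": 150, "T": 200}
PAD = 0  # intensity used to fill an incomplete final row


def sequence_to_image(seq, width=4):
    """Lay the sequence out row by row as a grid of grayscale pixels."""
    pixels = [BASE_TO_GRAY[base] for base in seq.upper()]
    pixels += [PAD] * (-len(pixels) % width)  # pad the last row if needed
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def intensity_histogram(image):
    """Normalized histogram over the four base intensities, ignoring padding."""
    counts = Counter(p for row in image for p in row if p != PAD)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(BASE_TO_GRAY)
    return [counts[level] / total for level in BASE_TO_GRAY.values()]


def build_index(sequences):
    """Store only the feature vector of each sequence, keyed by its name."""
    return {
        name: intensity_histogram(sequence_to_image(seq))
        for name, seq in sequences.items()
    }


def search(query, index, top_k=3):
    """Return the names of the top_k entries closest to the query's features."""
    q = intensity_histogram(sequence_to_image(query))

    def distance(feature):
        # Euclidean distance between two feature vectors
        return sqrt(sum((a - b) ** 2 for a, b in zip(q, feature)))

    return sorted(index, key=lambda name: distance(index[name]))[:top_k]


index = build_index({"s1": "ACGTACGT", "s2": "AAAAAAAA", "s3": "ACGTACGA"})
best = search("ACGTACGT", index, top_k=1)
```

Each stored entry here is four numbers regardless of sequence length, which illustrates why a feature-based index can stay small. A plain histogram discards the order of the bases, so this toy feature cannot distinguish sequences with the same composition; a practical system would need richer image features.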


Keywords (machine-generated, not supplied by the authors): Sequence Length, Index Size, Naive Approach, Search Sensitivity, Search Speed





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Heri Ramampiaro (1)
  • Aleksander Grande (1)
  1. Department of Computer and Information Science, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
