Digital Signal Processing Techniques for Gene Finding in Eukaryotes

  • Mahmood Akhtar
  • Eliathamby Ambikairajah
  • Julien Epps
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5099)


In this paper, we investigate the effects of window shape and length on a DFT-based method for gene and exon prediction in eukaryotes. We then propose a new gene-finding method that combines selected time-domain and frequency-domain methods, using the most effective DNA symbolic-to-numeric representation examined to date together with suitable window shape and length parameters and a signal boosting technique. The new method outperforms major existing approaches: on the GENSCAN test set, it achieves relative improvements of 15.1% to 55.9% over the compared methods in prediction accuracy of exonic nucleotides at a 5% false positive rate.
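
As a rough illustration of what a windowed DFT-based period-3 measure can look like, the following minimal Python sketch slides a window along a DNA sequence and sums the DFT power at frequency 1/3 over four binary indicator sequences (one per base). This is not the authors' implementation: the binary indicator representation, the function names `period3_power`, `rectangular` and `hamming`, and the default window length of 351 are all assumptions made for the example, and the signal boosting and time-domain stages are not shown.

```python
import cmath
import math


def rectangular(n):
    """Hypothetical helper: rectangular window of length n."""
    return [1.0] * n


def hamming(n):
    """Hypothetical helper: Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def period3_power(seq, window_len=351, window_fn=rectangular):
    """Sliding-window DFT power at frequency 1/3.

    Assumes a binary indicator representation: for each base, the signal
    is 1 where that base occurs and 0 elsewhere. Returns one power value
    per window position (len(seq) - window_len + 1 values).
    """
    seq = seq.upper()
    window = window_fn(window_len)
    # Window weight times the complex exponential e^{-j 2 pi n / 3}.
    coeff = [
        window[n] * cmath.exp(-2j * math.pi * n / 3) for n in range(window_len)
    ]
    powers = []
    for start in range(len(seq) - window_len + 1):
        chunk = seq[start:start + window_len]
        total = 0.0
        for base in "ACGT":
            # DFT coefficient of this base's indicator sequence at 1/3.
            x = sum(c for ch, c in zip(chunk, coeff) if ch == base)
            total += abs(x) ** 2
        powers.append(total)
    return powers


if __name__ == "__main__":
    toy = "ATG" * 50  # perfectly period-3 toy sequence, not real data
    for name, fn in (("rectangular", rectangular), ("hamming", hamming)):
        p = period3_power(toy, window_len=30, window_fn=fn)
        print(name, len(p), round(p[0], 3))
```

Swapping `window_fn` or `window_len` is the kind of parameter change whose effect on prediction the paper studies; the toy sequence here only shows that a strictly period-3 input produces a large response while a homopolymer produces essentially none.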


Keywords: DNA periodicity, discrete Fourier transforms, signal boosting



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Mahmood Akhtar (1, 2)
  • Eliathamby Ambikairajah (2)
  • Julien Epps (2)
  1. Centre for Health Informatics, University of New South Wales, Coogee, Australia
  2. School of EE&T, University of New South Wales, Sydney, Australia
