Abstract
Genomic research is concerned with the study of the genomes of organisms. A persistent challenge for researchers is identifying the segments of a DNA sequence that are involved in protein synthesis, known as the coding regions of a gene. The methods generally used to identify these segments rely on the period-3 property of genes, which can be detected with high accuracy using digital signal processing (DSP). Before DSP can be applied to gene prediction, a conversion rule is required to map the symbolic DNA sequence (ATCGTC…) to a numerical representation. The accuracy of gene prediction depends on this mapping rule, and the effectiveness of a mapping rule depends on the application area within genomics: a rule that works well for gene prediction may not perform well for genetic disease prediction. Most available conversion rules are fixed mapping techniques. In this paper, a new conversion rule is proposed for use prior to DSP, and a polyphase filter is used to suppress noise in the DNA spectrum. The performance of the proposed mapping is compared with that of existing mappings, and the performance of the polyphase filtering method is compared with that of existing filtering methods, in terms of signal-to-noise ratio (SNR) and location accuracy.
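To make the period-3 idea concrete, the following is a minimal sketch, not the mapping or filter proposed in this paper. It assumes the well-known fixed binary-indicator (Voss) mapping as a stand-in conversion rule, and it assumes one common definition of SNR: the spectral power at frequency bin N/3 divided by the mean power over all non-DC bins. The function names (`voss_indicators`, `power_spectrum`, `period3_snr`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def voss_indicators(seq):
    """Map a DNA string to four binary indicator sequences (fixed Voss mapping).

    Assumption: this fixed mapping is used only for illustration; it is not
    the variable mapping rule proposed in the paper.
    """
    seq = seq.upper()
    return {base: np.array([1.0 if c == base else 0.0 for c in seq])
            for base in "ACGT"}


def power_spectrum(seq):
    """Sum of squared DFT magnitudes over the four indicator sequences."""
    indicators = voss_indicators(seq)
    return sum(np.abs(np.fft.fft(x)) ** 2 for x in indicators.values())


def period3_snr(seq):
    """Power at bin N/3 divided by the mean power of all non-DC bins.

    Assumption: this is one common SNR definition, not necessarily the one
    used in the paper.
    """
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    spectrum = power_spectrum(seq)
    peak = spectrum[n // 3]              # period-3 component
    mean_power = spectrum[1:].mean()     # exclude the DC bin at index 0
    return peak / mean_power
```

For a strongly 3-periodic toy sequence such as `"ATG" * 30`, `period3_snr` returns a value much greater than 1, while for a 2-periodic sequence such as `"AT" * 45` it is close to 0. On real sequences the spectrum would typically be evaluated over a sliding window, so that the peak at N/3 can be localized along the DNA strand.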
Abbreviations
- DSP: Digital signal processing
- DNA: Deoxyribonucleic acid
- mRNA: Messenger ribonucleic acid
- PSD: Power spectral density
- DFT: Discrete Fourier transform
- IIR: Infinite impulse response
- FIR: Finite impulse response
- LPF: Low pass filter
- BPF: Band pass filter
- SNR: Signal to noise ratio
Cite this article
Singha Roy, S., Barman, S. Polyphase filtering with variable mapping rule in protein coding region prediction. Microsyst Technol 23, 4111–4121 (2017). https://doi.org/10.1007/s00542-016-2884-5
DOI: https://doi.org/10.1007/s00542-016-2884-5