Abstract
The rapid growth of genomic data in public databases calls for advanced computational tools that can perform fast gene analysis. Such tools can be devised with the aid of genomic signal processing, in which numerical mapping is the pivotal task. Numerical mapping transforms a string of nucleotides into a discrete numerical sequence by assigning an optimum mathematical descriptor to each nucleotide. The descriptor must be compatible with the later stages of the genomic application in order to achieve high efficiency. In this work, a simple numerical mapping method is proposed in which the optimum descriptor values are obtained by applying the Gray code concept. The proposed method is evaluated on the benchmark datasets HRM195 and ASP67 for the identification of protein coding regions. It exhibits improved exon prediction efficiency, in terms of accuracy and equal error rate, when compared with similar methods.
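As an illustration of the mapping idea, a minimal sketch follows. It assumes a 2-bit reflected Gray code and a nucleotide ordering of A, C, G, T. Both are assumptions made for illustration only, since the abstract does not state the descriptor values actually used. The names `binary_to_gray`, `gray_code_map`, and `encode` are hypothetical.

```python
# Illustrative sketch only: the nucleotide order below is an assumption,
# not the assignment used in the paper.
NUCLEOTIDE_ORDER = "ACGT"


def binary_to_gray(n: int) -> int:
    """Convert a non-negative integer to its reflected Gray code value."""
    return n ^ (n >> 1)


def gray_code_map(order: str = NUCLEOTIDE_ORDER) -> dict:
    """Assign each nucleotide the Gray code of its position in `order`."""
    return {base: binary_to_gray(i) for i, base in enumerate(order)}


def encode(sequence: str, mapping: dict = None) -> list:
    """Map a nucleotide string to a discrete numerical sequence."""
    if mapping is None:
        mapping = gray_code_map()
    # Unknown symbols (for example 'N') raise KeyError rather than being guessed.
    return [mapping[base] for base in sequence.upper()]


if __name__ == "__main__":
    print(gray_code_map())    # {'A': 0, 'C': 1, 'G': 3, 'T': 2}
    print(encode("GATTACA"))  # [3, 0, 2, 2, 0, 1, 0]
```

With this ordering, the codes of consecutive nucleotides differ in exactly one bit, which is the defining property of a Gray code. The resulting numerical sequence could then be passed to a later signal-processing stage for coding-region analysis.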
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Raman Kumar, M., Naveen Kumar, V. (2020). A Numerical Representation Method for a DNA Sequence Using Gray Code Method. In: Das, K., Bansal, J., Deep, K., Nagar, A., Pathipooranam, P., Naidu, R. (eds) Soft Computing for Problem Solving. Advances in Intelligent Systems and Computing, vol 1057. Springer, Singapore. https://doi.org/10.1007/978-981-15-0184-5_55
DOI: https://doi.org/10.1007/978-981-15-0184-5_55
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0183-8
Online ISBN: 978-981-15-0184-5
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)