Abstract
Mathematical analysis of large volumes of genomic DNA sequence data is one of the challenges facing biologists. Graphical representation of DNA or protein sequences provides a simple way to view, sort, and compare sequences by similarity. In this chapter, we introduce two approaches to constructing graphical representations of biological sequences: the first uses curves without degeneracy, and the second uses the Chaos Game Representation.
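To make the second approach concrete, the following is a minimal sketch of a Chaos Game Representation for a DNA sequence. It assumes the commonly used unit-square convention with A, C, G, T at the corners (0,0), (0,1), (1,1), (1,0) and a starting point at the center; the function name `cgr_points` and the choice to skip ambiguous symbols are illustrative assumptions, not necessarily the construction used in the chapter.

```python
from typing import List, Tuple

# Assumed corner assignment (a common convention, not taken from the chapter).
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(sequence: str) -> List[Tuple[float, float]]:
    """Return one CGR point per recognized nucleotide in `sequence`.

    Each point is the midpoint between the previous point and the
    corner of the current nucleotide, starting from the square's center.
    """
    x, y = 0.5, 0.5  # start at the center of the unit square
    points: List[Tuple[float, float]] = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # assumption: skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for base, point in zip("ACGT", cgr_points("ACGT")):
        print(base, point)
```

Because every step halves the distance to a corner, the position of the k-th point encodes the preceding nucleotides, so distinct sequences of equal length map to distinct final points under this convention.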
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Yau, S.S.-T., Zhao, X., Tian, K., Yu, H. (2023). Graphical Representation of Sequences and Its Application. In: Mathematical Principles in Bioinformatics. Interdisciplinary Applied Mathematics, vol 58. Springer, Cham. https://doi.org/10.1007/978-3-031-48295-3_5
DOI: https://doi.org/10.1007/978-3-031-48295-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48294-6
Online ISBN: 978-3-031-48295-3
eBook Packages: Mathematics and Statistics (R0)