We developed a methodology for processing DNA sequences based on inter-dinucleotide distances and used it to characterize the inter-dinucleotide distance distributions of the human genome. The distance distribution of each dinucleotide was compared with those of all the other dinucleotides using the Kullback-Leibler divergence. We found that the divergence between the distance distribution of a dinucleotide and that of its reversed complement is very small, indicating that the two distributions are very similar. This finding might provide evidence of a parity rule stronger than Chargaff’s second parity rule. We also compared the distance distribution of each dinucleotide with a reference distribution, that of a random sequence generated with the same dinucleotide abundances; the CG dinucleotide showed the highest cumulative relative error over the first 60 distances.
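The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions. A distance is taken to be the difference between the start positions of consecutive occurrences of the same dinucleotide, with overlapping occurrences counted. Distributions are truncated at a hypothetical `max_dist`. A small `pseudocount` is added so that the Kullback-Leibler divergence stays finite. All function names are illustrative.

```python
import math
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(word):
    """Return the reversed complement of a DNA word, e.g. AC -> GT."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def distance_counts(seq, dinuc, max_dist):
    """Count distances between consecutive occurrences of `dinuc`.

    Assumption: a distance is the difference between the start
    positions of two consecutive occurrences; distances larger than
    `max_dist` are discarded.
    """
    counts = Counter()
    last = None
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == dinuc:
            if last is not None and i - last <= max_dist:
                counts[i - last] += 1
            last = i
    return counts


def relative_frequencies(counts, max_dist, pseudocount=1e-9):
    """Turn counts into a relative frequency distribution over 1..max_dist.

    The pseudocount is an assumption of this sketch; it avoids zero
    probabilities in the divergence below.
    """
    total = sum(counts.values()) + pseudocount * max_dist
    return [(counts.get(d, 0) + pseudocount) / total
            for d in range(1, max_dist + 1)]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy usage: compare a dinucleotide with its reversed complement.
toy_sequence = "ACGTACGACGTTACGTGTACAC"
max_dist = 10
dinuc = "AC"
partner = reverse_complement(dinuc)
p = relative_frequencies(distance_counts(toy_sequence, dinuc, max_dist), max_dist)
q = relative_frequencies(distance_counts(toy_sequence, partner, max_dist), max_dist)
print(dinuc, partner, kl_divergence(p, q))
```

On a toy string the divergence carries no biological meaning; the sketch only shows the shape of the computation. Applying it to a real chromosome would require choosing how to handle ambiguous symbols such as N, which the abstract does not specify.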
- Distance Distribution
- Reference Distribution
- Parity Rule
- Reversed Complement
- Relative Frequency Distribution
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bastos, C.A.C., Afreixo, V., Pinho, A.J., Garcia, S.P., Rodrigues, J.M.O.S., Ferreira, P.J.S.G. (2011). Distances between Dinucleotides in the Human Genome. In: Rocha, M.P., Rodríguez, J.M.C., Fdez-Riverola, F., Valencia, A. (eds) 5th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2011). Advances in Intelligent and Soft Computing, vol 93. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19914-1_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19913-4
Online ISBN: 978-3-642-19914-1
eBook Packages: Engineering, Engineering (R0)