Dissimilar Symmetric Word Pairs in the Human Genome

  • Ana Helena Tavares
  • Jakob Raymaekers
  • Peter J. Rousseeuw
  • Raquel M. Silva
  • Carlos A. C. Bastos
  • Armando Pinho
  • Paula Brito
  • Vera Afreixo
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 616)

Abstract

In this work we explore the dissimilarity between symmetric word pairs, by comparing the inter-word distance distribution of a word to that of its reversed complement. We propose a new measure of dissimilarity between such distributions. Since symmetric pairs with different patterns could point to evolutionary features, we search for the pairs with the most dissimilar behaviour. We focus our study on the complete human genome and its repeat-masked version.
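The comparison described above can be illustrated with a minimal sketch. The abstract does not specify the proposed dissimilarity measure, so the total variation distance below is only a stand-in, not the authors' measure. All function names are hypothetical, and the sketch assumes an uppercase sequence over A, C, G, T, with overlapping occurrences counted and no special handling of masked bases.

```python
from collections import Counter

# Assumption: the sequence contains only uppercase A, C, G, T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reversed_complement(word):
    """Complement each base, then reverse the word."""
    return word.translate(COMPLEMENT)[::-1]


def inter_word_distances(sequence, word):
    """Distances between start positions of consecutive occurrences of word.

    Overlapping occurrences are counted (an illustrative choice).
    """
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        start = sequence.find(word, start + 1)
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_distribution(distances):
    """Relative frequency of each observed distance."""
    counts = Counter(distances)
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()} if total else {}


def total_variation(p, q):
    """Total variation distance: a stand-in, NOT the paper's measure."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(d, 0.0) - q.get(d, 0.0)) for d in support)


def symmetric_pair_dissimilarity(sequence, word):
    """Compare the distance distribution of a word with that of its reversed complement."""
    p = distance_distribution(inter_word_distances(sequence, word))
    q = distance_distribution(
        inter_word_distances(sequence, reversed_complement(word))
    )
    return total_variation(p, q)
```

Ranking all words of a given length by `symmetric_pair_dissimilarity` would then surface the pairs whose distance patterns differ most under this stand-in measure.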

Keywords

Inter-word distance, Reversed complements, Dissimilarity measure, Human genome

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Ana Helena Tavares (1)
  • Jakob Raymaekers (2)
  • Peter J. Rousseeuw (2)
  • Raquel M. Silva (3, 4)
  • Carlos A. C. Bastos (4, 5)
  • Armando Pinho (4, 5)
  • Paula Brito (6)
  • Vera Afreixo (1, 3, 4)

  1. Department of Mathematics and CIDMA, University of Aveiro, Aveiro, Portugal
  2. Department of Mathematics, KU Leuven, Leuven, Belgium
  3. Department of Medical Sciences and iBiMED, University of Aveiro, Aveiro, Portugal
  4. IEETA, University of Aveiro, Aveiro, Portugal
  5. DETI, University of Aveiro, Aveiro, Portugal
  6. FEP and LIAAD - INESC TEC, University of Porto, Porto, Portugal
