De Novo Assembly and Cluster Analysis of Siberian Larch Transcriptome and Genome

  • Michael Sadovsky
  • Yulia Putintseva
  • Vladislav Birukov
  • Serafima Novikova
  • Konstantin Krutovsky
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9656)


We studied the Siberian larch (Larix sibirica) transcriptome through de novo assembly and cluster analysis of the frequency dictionaries of its contigs. Some preliminary results of a similar study of the larch genome are also presented. The larch transcriptome was found to exhibit a number of unexpected symmetries in the statistical and combinatorial properties of these entities.
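As an illustration of what a contig frequency dictionary might look like, here is a minimal Python sketch. The abstract does not say how the dictionaries were built, so the details are assumptions: triplets as the word length (suggested by the keywords), a sliding window with step 1, skipping windows that contain symbols other than A, C, G, T, and Euclidean distance as the measure of dissimilarity between two contigs. The function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"
# All 64 possible triplets, in a fixed order.
TRIPLETS = ["".join(p) for p in product(ALPHABET, repeat=3)]


def triplet_frequencies(seq, step=1):
    """Return a dict mapping each of the 64 triplets to its frequency in seq.

    Assumption: the window slides by `step` positions (1 by default).
    Windows containing symbols outside A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, step))
    valid = {t: counts.get(t, 0) for t in TRIPLETS}
    total = sum(valid.values())
    if total == 0:
        # No valid triplet found (sequence too short or fully ambiguous).
        return {t: 0.0 for t in TRIPLETS}
    return {t: n / total for t, n in valid.items()}


def euclidean_distance(f1, f2):
    """Euclidean distance between two frequency dictionaries (assumed metric)."""
    return sqrt(sum((f1[t] - f2[t]) ** 2 for t in TRIPLETS))


if __name__ == "__main__":
    a = triplet_frequencies("ACGTACGT")
    b = triplet_frequencies("AAAAAAAA")
    print(a["ACG"], euclidean_distance(a, b))
```

Each contig then becomes a point in a 64-dimensional space, and any clustering method can be applied to those points. The choice of step 1 versus step 3 matters for coding sequences, since step 3 keeps the window in a single reading frame.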


Keywords: Frequency, Triplet, Order, Cluster, Elastic map, Evolution



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Michael Sadovsky (1), email author
  • Yulia Putintseva (1)
  • Vladislav Birukov (1)
  • Serafima Novikova (1)
  • Konstantin Krutovsky (1)
  1. Institute of Computational Modelling of SB RAS, Krasnoyarsk, Russia
