Formal Language Representation and Modelling Structures Underlying RNA Folding Process

  • Anand Mahendran
  • Lakshmanan Kuppusamy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10398)


The biological sequences that occur in DNA, RNA and proteins can be considered as strings over well-defined chemical alphabets. Such gene sequences form structures based on complementary pairing, and these structures can be interpreted as languages. Matrix insertion-deletion systems, introduced a few years ago, have been used to model several bio-molecular structures that occur at the intramolecular and intermolecular levels. In this paper, we identify some structures that are frequently observed during the RNA folding process, such as the double bulge loop, the extended internal loop, and the triple stem and loop, and we give the corresponding formal language representations for these structures. Further, we model the structures using matrix insertion-deletion systems. This work is a pioneering attempt to give language representations for, and to model, the structures of the RNA folding process using formal grammars.
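To make the idea of a matrix insertion-deletion system more concrete, the following is a minimal Python sketch of a toy system that derives a simple stem, that is, a word u followed by its reverse complement. It is not the construction from the paper: the marker symbol `#`, the rule encoding `(kind, left, x, right)`, and the function names `apply_rule`, `apply_matrix`, `pair_matrix` and `derive_stem` are all assumptions made for illustration. The sketch only relies on the general notion that a matrix is a sequence of context-dependent insertion or deletion rules that must all be applied, in order, for the matrix to apply.

```python
# Illustrative toy only: NOT the paper's construction for RNA folding structures.
# Assumed RNA alphabet and Watson-Crick complement pairs.
COMPLEMENT = {"a": "u", "u": "a", "g": "c", "c": "g"}
MARKER = "#"  # hypothetical non-terminal marking the centre of the stem


def apply_rule(word, rule):
    """Apply one rule (kind, left, x, right) at the first matching position.

    kind "ins": rewrite left+right into left+x+right.
    kind "del": rewrite left+x+right into left+right.
    Returns the new word, or None if the rule is not applicable.
    (A real system chooses the position nondeterministically; the unique
    marker makes the choice unambiguous in this toy.)
    """
    kind, left, x, right = rule
    if kind == "ins":
        pos = word.find(left + right)
        if pos < 0:
            return None
        cut = pos + len(left)
        return word[:cut] + x + word[cut:]
    pos = word.find(left + x + right)
    if pos < 0:
        return None
    cut = pos + len(left)
    return word[:cut] + word[cut + len(x):]


def apply_matrix(word, matrix):
    """Apply every rule of the matrix in order; fail if any rule fails."""
    for rule in matrix:
        word = apply_rule(word, rule)
        if word is None:
            return None
    return word


def pair_matrix(b):
    """Matrix inserting b just left of the marker and its complement just right."""
    return [("ins", "", b, MARKER), ("ins", MARKER, COMPLEMENT[b], "")]


CLOSE = [("del", "", MARKER, "")]  # matrix that removes the marker


def derive_stem(u):
    """Derive u followed by its reverse complement, starting from the axiom '#'."""
    word = MARKER
    for b in u:
        word = apply_matrix(word, pair_matrix(b))
    return apply_matrix(word, CLOSE)


print(derive_stem("gac"))  # gacguc
```

Grouping the two insertions in one matrix is what keeps each base synchronised with its complementary partner; applying either insertion alone would break the pairing.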


Keywords

Gene sequences, Bio-molecular structures, Matrix grammars, Insertion-deletion systems, Folding process



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Computer Science and Engineering, VIT University, Vellore, India
