Decoding Genomic Information
Genomes carry the main information underlying the life of organisms and their evolution. In nature they work as a marvellous operating system of molecular rules (for reading, writing and signal transmission), orchestrating all cell functions and the transmission of information to daughter cells. As long polymers of nucleotides, they may be seen as a special book that records in its own sequence all the developments it has passed through during evolution. Fragments that were mutated, duplicated, assembled or silenced are still present, to some extent, in the genomic sequence, where they form genomic dictionaries.
Here we outline some research trends that analyse and interpret (i.e., decode) genomic information by treating the genome as a book written in an unknown language that has yet to be deciphered, even though it directly affects the structure and interaction of all cellular and multicellular components. We focus on an informational analysis of real genomes, which may be framed within a new trend of computational genomics lying across bioinformatics and natural computing. This analysis is performed by alignment-free sequence methods based on information-theoretic concepts, in order to convert genomic information into a comprehensible mathematical form and to understand its complexity.
After a brief overview of the state of the art and of the approaches in the area, we present our viewpoint and results on genome-wide studies, carried out by means of mathematical distributions and dictionary-based analysis inspired by information theory, where the normalized multiplicities of genomic words are frequencies that define discrete probability distributions of interest. The definition, computation and analysis of a few informational indexes have highlighted some properties of genomic regularity and specificity, which may serve as a basis for understanding evolutionary and functional aspects of genomes.
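As a rough illustration of the idea that normalized multiplicities of genomic words define a discrete probability distribution, the following minimal Python sketch counts the overlapping words of length k in a sequence, normalizes the counts, and computes the Shannon entropy of the resulting distribution. The function names `word_distribution` and `shannon_entropy` are hypothetical, and Shannon entropy is used here only as a generic example of an informational index; the specific indexes studied in the paper are not given in this abstract.

```python
from collections import Counter
from math import log2


def word_distribution(sequence: str, k: int) -> dict[str, float]:
    """Return the frequency of each overlapping word of length k.

    Illustrative assumption: words are extracted with a sliding window
    of step 1, and each frequency is the word's multiplicity divided by
    the total number of word occurrences.
    """
    if k <= 0 or k > len(sequence):
        raise ValueError("k must satisfy 1 <= k <= len(sequence)")
    # Multiplicity of every k-long word found in the sequence.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def shannon_entropy(distribution: dict[str, float]) -> float:
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * log2(p) for p in distribution.values() if p > 0)


if __name__ == "__main__":
    toy_genome = "ACGTACGT"  # toy sequence, not real genomic data
    dist = word_distribution(toy_genome, 2)
    print(dist)
    print(shannon_entropy(dist))
```

On a real genome the same functions would be applied to the full nucleotide string for several values of k, which gives one distribution (and one entropy value) per word length.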