Nucleosome positioning based on generalized relative entropy
Nucleosome positioning plays significant roles in various biological processes. With the development of high-throughput techniques, many methods and software tools have been developed for nucleosome positioning. Although results with high accuracy (Acc) have been obtained, identifying the key factors that determine nucleosome positioning with lower time complexity remains unresolved. Therefore, a novel method combining generalized relative entropy with the self-similarity of DNA sequences was proposed for predicting nucleosome positioning in the human, worm, fly and yeast genomes. Experimental results showed that the prediction Acc on these four datasets reached 87.78%, 87.98%, 83.36% and 100%, respectively. Furthermore, five-nucleotide and six-nucleotide sequences were found to be the determinant factors in nucleosome positioning.
Keywords: Nucleosome positioning; Generalized relative entropy; Random forest; Support vector machines
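The abstract does not define generalized relative entropy, so the following is only a minimal sketch of the underlying idea: comparing the k-nucleotide frequency distributions of two DNA sequences with the standard Kullback-Leibler relative entropy, used here as a stand-in and not as the authors' measure. The function names, the pseudocount smoothing, and the toy sequences are illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import log2

ALPHABET = "ACGT"


def kmer_distribution(seq, k, pseudocount=1.0):
    """Smoothed frequencies of all 4**k k-mers in seq (hypothetical helper).

    The pseudocount is an assumption that keeps every probability positive,
    so the logarithm below is always defined.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # k-mers containing symbols outside ACGT (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers) + pseudocount * len(kmers)
    return {m: (counts[m] + pseudocount) / total for m in kmers}


def relative_entropy(p, q):
    """Standard Kullback-Leibler divergence D(p || q) in bits.

    This is NOT the paper's generalized relative entropy, only the
    classical quantity it presumably extends.
    """
    return sum(p[m] * log2(p[m] / q[m]) for m in p)


# Toy sequences, invented for illustration only.
seq_a = "ACGTACGTTTGACCAGGCTA"
seq_b = "TTTTAAAATTTTAAAATTTT"

p = kmer_distribution(seq_a, k=2)
q = kmer_distribution(seq_b, k=2)
divergence = relative_entropy(p, q)
print(f"D(p || q) = {divergence:.4f} bits")
```

With `k=5` or `k=6`, the same helpers would produce the five- and six-nucleotide frequency profiles that the abstract identifies as determinant factors. How the authors turn such profiles into classifier features is not stated here.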
This research was funded by the National Natural Science Foundation of China (Grant No. 61502254), the Program for Young Talents of Science and Technology in Universities of Inner Mongolia Autonomous Region (Grant No. NJYT-18-B10), and the Open Funds of the Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education (Grant No. 93K172018K07).
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.