Abstract
Genetic disorders such as cancer and birth defects pose a serious threat to human life. Since these diseases were first identified, considerable research effort has gone into recognizing and curing them. These disorders affect genes and appear as abnormal traits in an organism. To recognize abnormal genes, we need to predict the splice sites in a DNA sequence; we can then process the genetic code between two consecutive splice sites and analyze the trait it represents. Beyond abnormal genes and the disorders they cause, the same approach can identify normal human traits such as physical and mental features. The primary issue is therefore to estimate splice sites precisely. In this paper, we introduce two new methods that use a neuro-fuzzy network and clustering for DNA splice site prediction. In the first, instead of feeding the raw nucleotide sequence to a neural network, we survey an initial batch of nucleotide sequences from the true and false categories of the input and train the neuro-fuzzy network on the similarities and dissimilarities of the selected sequences. In addition, the sequences of the large input data set are clustered into smaller categories to improve prediction, since they are in fact spliced by different mechanisms. Experimental results show that these improvements increase the recognition rate of splice sites.
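The idea of training on similarities rather than on raw nucleotides can be illustrated with a small sketch. The abstract does not state which similarity measure is used, so the per-position match fraction below is an assumption, as are the function names `match_fraction` and `similarity_features` and the toy sequences. The sketch covers only the feature step: each candidate window is reduced to two numbers, its mean similarity to a set of known true sites and to a set of known false sites, which could then serve as inputs to a neuro-fuzzy network. The network and the clustering step are not shown.

```python
from typing import List, Tuple


def match_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences agree.

    This is an assumed similarity measure, not one taken from the paper.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_features(candidate: str,
                        true_refs: List[str],
                        false_refs: List[str]) -> Tuple[float, float]:
    """Mean similarity of a candidate window to true and false reference sets."""
    if not true_refs or not false_refs:
        raise ValueError("both reference sets must be non-empty")
    sim_true = sum(match_fraction(candidate, r) for r in true_refs) / len(true_refs)
    sim_false = sum(match_fraction(candidate, r) for r in false_refs) / len(false_refs)
    return sim_true, sim_false


# Hypothetical 8-nucleotide windows, invented for illustration only.
true_refs = ["AAGGTAAG", "CAGGTGAG"]    # windows labeled as true splice sites
false_refs = ["TTCCATGC", "GACCTTAA"]   # windows labeled as false splice sites

features = similarity_features("AAGGTGAG", true_refs, false_refs)
print(features)
```

A candidate that resembles the true set more than the false set yields a larger first feature than second; a classifier trained on such pairs sees two similarity scores instead of the raw nucleotide string.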
Cite this article
Moghimi, F., Manzuri Shalmani, M.T., Khaki Sedigh, A. et al. Two new methods for DNA splice site prediction based on neuro-fuzzy network and clustering. Neural Comput & Applic 23 (Suppl 1), 407–414 (2013). https://doi.org/10.1007/s00521-012-1257-y
DOI: https://doi.org/10.1007/s00521-012-1257-y