Abstract
In this paper, several grammar induction methods are examined in the context of biological sequence analysis. In addition, a new method that generates noncircular context-free grammars is proposed. A computational experiment shows that the proposed evolutionary-inspired approach statistically outperforms other grammatical inference algorithms, with respect to classification quality, on sequences from a real amyloidogenic dataset.
This research was supported by the National Science Center, grant 2016/21/B/ST6/02158.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wieczorek, W., Unold, O. (2019). GP-Based Grammatical Inference for Classification of Amyloidogenic Sequences. In: Bartoletti, M., et al. Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2017. Lecture Notes in Computer Science, vol 10834. Springer, Cham. https://doi.org/10.1007/978-3-030-14160-8_9
DOI: https://doi.org/10.1007/978-3-030-14160-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14159-2
Online ISBN: 978-3-030-14160-8
eBook Packages: Computer Science, Computer Science (R0)