Mutagenicity Analysis Based on Rough Set Theory and Formal Concept Analysis
Most current machine learning applications in cheminformatics are black boxes. Support vector machines and neural networks are the most widely used classification techniques for predicting the mutagenic activity of compounds. The problem with these techniques is that the rules or reasons behind a prediction are unknown. Such rules could show the most important features (descriptors) of the compounds and the relations among them. This article proposes a model that generates the rules governing prediction through rough set theory. These rules, which are based on two levels of selection of the features with high discriminating power, are visualized as a lattice generated using the formal concept analysis approach. As a result, a better understanding of the reasons that lead to mutagenic activity can be obtained. The resulting lattice shows that lipophilicity, number of nitrogen atoms, and electronegativity are the most important parameters in mutagenicity detection. Moreover, experimental results are compared against previous research to validate the proposed model.
Keywords: Support Vector Machine, Formal Concept, Target Class, Formal Concept Analysis, Feature Selection Technique
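
To make the lattice idea in the abstract concrete, the following is a minimal sketch of how formal concepts (the nodes of a concept lattice) can be enumerated from a binary object-attribute table. It is not the authors' implementation: the compounds `c1` to `c4`, the attribute names `high_logP`, `many_N` and `mutagenic`, and all function names are hypothetical and chosen only for illustration. The brute-force enumeration over object subsets is assumed to be acceptable only for very small contexts.

```python
from itertools import combinations

# Hypothetical toy context (NOT data from the paper):
# each compound maps to the set of binary attributes it has.
context = {
    "c1": {"high_logP", "many_N"},
    "c2": {"high_logP", "many_N", "mutagenic"},
    "c3": {"high_logP"},
    "c4": {"many_N", "mutagenic"},
}


def common_attributes(objects, context, all_attributes):
    """Attributes shared by every object in `objects`.

    For an empty object set this is, by convention, the full attribute set.
    """
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def objects_having(attributes, context):
    """Objects that possess every attribute in `attributes`."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by brute force.

    Every concept intent is the set of attributes common to some
    subset of objects, so closing each subset yields every concept.
    This is exponential in the number of objects.
    """
    all_attributes = set().union(*context.values())
    objects = sorted(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = objects_having(intent, context)
            concepts.add((extent, intent))
    return concepts


concepts = formal_concepts(context)
# Print concepts from the most general (largest extent) to the most specific.
for extent, intent in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(extent), sorted(intent))
```

In this toy context, the concept whose intent is `{many_N, mutagenic}` groups the compounds `c2` and `c4`; reading such groupings off the lattice is the kind of interpretation the abstract describes. Ordering the concepts by extent inclusion would give the lattice itself.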