Improvement of FP-Growth Algorithm for Mining Description-Oriented Rules
This paper proposes a new modification of a rule induction method, based on the FP-growth algorithm, for describing gene groups using Gene Ontology (GO). The modification takes advantage of the hierarchical structure of the GO graph, a specific property of a single prefix-path FP-tree, and the fact that rules generated for description purposes do not include in their premise two GO terms that are in a parent-child relation. The proposed algorithm was implemented and tested on two different expression datasets. The time performance of the old and new approaches is compared, together with the descriptions obtained by the two methods. The results show that the new method generates rules faster, while the number of rules and the coverage are similar in both approaches.
Keywords: rules induction, FP-growth, Gene Ontology, time performance, functional description
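The constraint mentioned in the abstract, that a rule premise should not contain two GO terms related through the hierarchy, can be illustrated with a minimal sketch. It is not the authors' algorithm and does not build an FP-tree; it only shows how candidate premises might be filtered. The toy graph `PARENTS`, the placeholder term identifiers, and the helper names `ancestors`, `is_valid_premise`, and `candidate_premises` are all hypothetical, and the sketch assumes the check extends from direct parent-child pairs to any ancestor-descendant pair.

```python
from itertools import combinations

# Hypothetical toy fragment of a GO-like graph: term -> set of direct parents.
# The identifiers are placeholders, not real GO accessions.
PARENTS = {
    "GO:B": {"GO:A"},
    "GO:C": {"GO:A"},
    "GO:D": {"GO:B"},
}


def ancestors(term, parents):
    """Return every ancestor of `term` by walking parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_valid_premise(terms, parents):
    """True if no two terms are in an ancestor-descendant relation."""
    for a, b in combinations(terms, 2):
        if a in ancestors(b, parents) or b in ancestors(a, parents):
            return False
    return True


def candidate_premises(terms, parents, max_len=2):
    """Enumerate term combinations up to `max_len` that pass the check."""
    result = []
    for size in range(1, max_len + 1):
        for combo in combinations(sorted(terms), size):
            if is_valid_premise(combo, parents):
                result.append(combo)
    return result


if __name__ == "__main__":
    for premise in candidate_premises(["GO:A", "GO:B", "GO:C", "GO:D"], PARENTS):
        print(premise)
```

In this toy graph, the pair of `GO:B` and `GO:C` is kept because the two terms are siblings, while any pair containing `GO:A` together with one of its descendants is discarded. Skipping such combinations early is one plausible way a description-oriented miner could reduce the number of candidates it has to evaluate.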