Statistically Significant Discriminative Patterns Searching
In this paper, we propose a novel algorithm, named SSDPS, to discover statistically significant discriminative patterns in two-class datasets. The SSDPS algorithm owes its efficiency to an original pattern enumeration strategy, which allows it to exploit a degree of anti-monotonicity in the measures of discriminance and statistical significance. Experimental results demonstrate that SSDPS outperforms the other algorithms it was compared against and generates far fewer patterns than they do. Experiments on real data also show that SSDPS efficiently detects multiple-SNP combinations in genetic data.
Keywords: Discriminative patterns, Discriminative measures, Statistical significance, Anti-monotonicity
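To make the two notions in the abstract concrete, the sketch below scores one candidate pattern on a toy two-class dataset. It is not the SSDPS algorithm or its enumeration strategy: the support-difference measure, the one-sided Fisher exact test, the toy transactions, and every function name are assumptions chosen purely for illustration.

```python
from math import comb

def support(pattern, transactions):
    """Count the transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the classes, columns are pattern present / absent. Returns the
    probability of at least `a` in the top-left cell with the margins fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)

# Hypothetical toy data: each transaction is a set of items.
positives = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"b", "c"}]
negatives = [{"a", "c"}, {"c", "d"}, {"b", "d"}, {"d"}]

pattern = {"a", "b"}
a = support(pattern, positives)   # positives containing the pattern
c = support(pattern, negatives)   # negatives containing the pattern
b = len(positives) - a
d = len(negatives) - c

# Illustrative discriminative measure: difference of relative supports.
support_difference = a / len(positives) - c / len(negatives)
p_value = fisher_one_sided(a, b, c, d)

# Support is anti-monotone: a superset never has higher support.
superset_support = support(pattern | {"c"}, positives)

print(support_difference, p_value, superset_support <= a)
```

Support can never increase when an item is added to a pattern, which is the kind of property a search can use to prune candidates. Measures such as the support difference or a p-value do not behave this way in general, which is why the abstract speaks only of "a degree of" anti-monotonicity.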