Mutagenicity Risk Analysis by Using Class Association Rules

  • Takashi Washio
  • Koutarou Nakanishi
  • Hiroshi Motoda
  • Takashi Okada
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4012)


Mutagenicity analysis of chemical compounds is crucial for investigating the causes of modern diseases, including cancers. Such analysis requires accurate and comprehensive classification of mutagenicity. In particular, the choice of appropriate features of the chemical compounds plays a key role in the interpretability of the classification results. In this paper, a classification approach named “Levelwise Subspace Clustering based Classification by Aggregating Emerging Patterns (LSC-CAEP)”, which is known to be accurate and to provide interpretable rules, is applied to a mutagenicity data set. A demonstration shows promising results of the analysis.


Keywords (added by machine, not by the authors): Association Rule, Frequent Itemsets, Dense Cluster, Subspace Cluster, Aggregate Score





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Takashi Washio (1)
  • Koutarou Nakanishi (1)
  • Hiroshi Motoda (1)
  • Takashi Okada (2)
  1. I.S.I.R., Osaka University
  2. Kwansei Gakuin University
