Constraint-Based Mining of Fault-Tolerant Patterns from Boolean Data

  • Jérémy Besson
  • Ruggero G. Pensa
  • Céline Robardet
  • Jean-François Boulicaut
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3933)


Thanks to substantial research effort over the last few years, inductive queries on local patterns (e.g., set patterns) and their associated complete solvers have proved extremely useful for supporting knowledge discovery. The more we use such queries on real-life data, e.g., biological data, the more we are convinced that inductive queries should return fault-tolerant patterns. This is obviously the case when considering formal concept discovery from noisy datasets. We therefore study various extensions of formal concepts, a particular kind of bi-set, towards fault-tolerance. We compare three declarative specifications of fault-tolerant bi-sets by means of a constraint-based mining approach. Our framework enables a better understanding of the trade-off needed between extraction feasibility, completeness, relevance, and ease of interpretation of these fault-tolerant patterns. An original empirical evaluation on both synthetic and real-life medical data is given. It enables a comparison of the various proposals and motivates further directions of research.
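
To make the motivation concrete, the following is a minimal sketch of why exact formal concepts are fragile in the presence of noise. The toy context `DATA`, the helper names, and the "maximum number of zeros per row or column" measure are all assumptions chosen for illustration; the measure is not necessarily one of the three specifications compared in the paper, and the brute-force enumeration is not the authors' solver.

```python
from itertools import combinations

# Hypothetical toy Boolean context: each object maps to the attributes it has.
DATA = {
    "o1": {"a", "b", "c"},
    "o2": {"a", "b", "c"},
    "o3": {"a", "b"},  # "c" is absent here, e.g. because of noise
    "o4": {"d"},
}
ATTRIBUTES = {"a", "b", "c", "d"}


def common_attributes(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    result = set(ATTRIBUTES)
    for o in objects:
        result &= DATA[o]
    return result


def supporting_objects(attributes):
    """Objects that have every attribute in `attributes`."""
    return {o for o, attrs in DATA.items() if attributes <= attrs}


def formal_concepts():
    """Brute-force enumeration: close every attribute subset (toy sizes only)."""
    concepts = set()
    attrs = sorted(ATTRIBUTES)
    for k in range(len(attrs) + 1):
        for subset in combinations(attrs, k):
            objs = supporting_objects(set(subset))
            concepts.add((frozenset(objs), frozenset(common_attributes(objs))))
    return concepts


def max_zeros(objects, attributes):
    """Largest number of 0 cells found in any single row or column of the bi-set."""
    per_row = max((len(attributes - DATA[o]) for o in objects), default=0)
    per_col = max(
        (sum(1 for o in objects if a not in DATA[o]) for a in attributes),
        default=0,
    )
    return max(per_row, per_col)


if __name__ == "__main__":
    for objs, attrs in sorted(formal_concepts(), key=lambda c: sorted(c[0])):
        print(sorted(objs), sorted(attrs))
    # The larger bi-set is not a formal concept, but it contains few zeros.
    print(max_zeros({"o1", "o2", "o3"}, {"a", "b", "c"}))
```

In this toy context the single missing value splits what would intuitively be one pattern into two formal concepts, ({o1, o2, o3}, {a, b}) and ({o1, o2}, {a, b, c}), whereas the bi-set ({o1, o2, o3}, {a, b, c}) has at most one zero in any row or column. A fault-tolerant specification would aim to return bi-sets of the latter kind.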


Keywords: Bacterial Meningitis; Formal Concept; Pattern Mining; Frequent Itemsets; Galois Connection





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jérémy Besson (1, 2)
  • Ruggero G. Pensa (1)
  • Céline Robardet (3)
  • Jean-François Boulicaut (1)
  1. INSA Lyon, LIRIS CNRS UMR 5205, Villeurbanne, France
  2. UMR INRA/INSERM 1235, Lyon, France
  3. INSA Lyon, PRISMA, Villeurbanne, France
