A New Approach for Association Rule Mining and Bi-clustering Using Formal Concept Analysis

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2012)

Abstract

Association rule mining and bi-clustering are data mining tasks that have become very popular in many application domains, particularly in bioinformatics. However, to our knowledge, no algorithm has been introduced that performs both tasks in a single process. We propose a new approach, called FIST, for jointly extracting bases of extended association rules and conceptual bi-clusters. This approach is based on the frequent closed itemsets framework and requires a single scan of the database. It uses a new suffix-tree-based data structure to reduce memory usage and improve extraction efficiency, and this structure allows the tree branches to be processed in parallel. Experiments conducted to assess its applicability to very large datasets show that FIST's memory requirements and execution times are in most cases equivalent to those of algorithms based on frequent closed itemsets and lower than those of algorithms based on frequent itemsets.
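To make the frequent closed itemsets framework concrete, the following minimal Python sketch enumerates the frequent closed itemsets of a toy database together with the objects that support them. Each such pair (object set, closed itemset) is a formal concept and can be read as a conceptual bi-cluster. This is not the FIST algorithm: it is a brute-force illustration that ignores the suffix-tree structure and the single-scan property described in the abstract. The data and the names `DATABASE`, `extent`, `intent`, `frequent_closed_itemsets`, and `confidence` are assumptions made for this example.

```python
from itertools import combinations

# Toy transaction database (hypothetical data): object id -> set of items.
DATABASE = {
    1: {"a", "b", "c"},
    2: {"a", "b"},
    3: {"a", "b", "c"},
    4: {"c"},
    5: {"a", "b", "d"},
}


def extent(itemset, database):
    """Objects whose item set contains every item of `itemset`."""
    return frozenset(o for o, items in database.items() if itemset <= items)


def intent(objects, database):
    """Items shared by all `objects` (every item if `objects` is empty)."""
    shared = set().union(*database.values())
    for o in objects:
        shared &= database[o]
    return frozenset(shared)


def frequent_closed_itemsets(database, min_support):
    """Brute force: map each frequent closed itemset to its extent."""
    items = sorted(set().union(*database.values()))
    closed = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            objs = extent(frozenset(candidate), database)
            if len(objs) >= min_support:
                # The closure of a candidate is the intent of its extent.
                closed[intent(objs, database)] = objs
    return closed


def confidence(antecedent, consequent, database):
    """Confidence of the rule antecedent -> consequent."""
    both = extent(antecedent | consequent, database)
    return len(both) / len(extent(antecedent, database))


concepts = frequent_closed_itemsets(DATABASE, min_support=2)
for itemset, objs in sorted(concepts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), "supported by objects", sorted(objs))

# {a} is not closed: its closure is {a, b}, so a -> b holds with confidence 1.
print(confidence(frozenset("a"), frozenset("b"), DATABASE))
```

On this toy database, the itemsets {a} and {b} occur in exactly the same objects, so neither is closed and both collapse into the closed itemset {a, b}. This collapsing is what lets closed-itemset approaches represent the rules with fewer itemsets than approaches that enumerate all frequent itemsets.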

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mondal, K.C., Pasquier, N., Mukhopadhyay, A., Maulik, U., Bandhopadyay, S. (2012). A New Approach for Association Rule Mining and Bi-clustering Using Formal Concept Analysis. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2012. Lecture Notes in Computer Science, vol 7376. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31537-4_8

  • DOI: https://doi.org/10.1007/978-3-642-31537-4_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31536-7

  • Online ISBN: 978-3-642-31537-4

  • eBook Packages: Computer Science, Computer Science (R0)
