Abstract
Association rule mining and bi-clustering are data mining tasks that have become very popular in many application domains, particularly in bioinformatics. However, to our knowledge, no algorithm has been introduced that performs these two tasks in a single process. We propose a new approach, called FIST, for jointly extracting bases of extended association rules and conceptual bi-clusters. This approach is based on the frequent closed itemsets framework and requires a single scan of the database. It uses a new suffix-tree-based data structure to reduce memory usage and improve extraction efficiency, and this structure allows the tree branches to be processed in parallel. Experiments conducted to assess its applicability to very large datasets show that FIST's memory requirements and execution times are in most cases equivalent to those of frequent-closed-itemset-based algorithms and lower than those of frequent-itemset-based algorithms.
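To make the link between closed itemsets, conceptual bi-clusters and rules concrete, the following is a minimal brute-force sketch in Python. It is not the FIST algorithm: it uses no suffix tree, enumerates every candidate itemset, and only suits toy data. The dataset `db`, the threshold `min_support` and all function names are assumptions made for illustration. Each frequent closed itemset is returned with the set of objects that support it, and that pair can be read as a conceptual bi-cluster (a formal concept); the supports of two nested closed itemsets then give the confidence of a rule between them.

```python
from itertools import combinations

def extent(itemset, db):
    """Ids of the objects whose item sets contain every item of `itemset`."""
    return frozenset(oid for oid, items in db.items() if itemset <= items)

def intent(objects, db, all_items):
    """Items shared by all `objects` (every item if `objects` is empty)."""
    common = set(all_items)
    for oid in objects:
        common &= db[oid]
    return frozenset(common)

def frequent_closed_itemsets(db, min_support):
    """Map each frequent closed itemset to its supporting objects.

    Brute force: the closure of an itemset X is intent(extent(X)), and
    the pair (extent, closure) is one conceptual bi-cluster.
    """
    all_items = frozenset().union(*db.values())
    closed = {}
    for size in range(len(all_items) + 1):
        for combo in combinations(sorted(all_items), size):
            objs = extent(frozenset(combo), db)
            if len(objs) >= min_support:
                closed[intent(objs, db, all_items)] = objs
    return closed

# Hypothetical toy dataset: object id -> set of items.
db = {
    1: {"a", "c", "d"},
    2: {"b", "c", "e"},
    3: {"a", "b", "c", "e"},
    4: {"b", "e"},
    5: {"a", "b", "c", "e"},
}

result = frequent_closed_itemsets(db, min_support=2)
for items, objs in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(items), "supported by objects", sorted(objs))

# Confidence of the rule {a, c} -> {b, e}, computed from two closed itemsets.
small, big = frozenset("ac"), frozenset("abce")
confidence = len(result[big]) / len(result[small])
```

The sketch scans candidate itemsets exponentially many times, which is exactly the cost that a single-scan, tree-based method such as the one described in the abstract is meant to avoid.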
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mondal, K.C., Pasquier, N., Mukhopadhyay, A., Maulik, U., Bandhopadyay, S. (2012). A New Approach for Association Rule Mining and Bi-clustering Using Formal Concept Analysis. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2012. Lecture Notes in Computer Science, vol 7376. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31537-4_8
DOI: https://doi.org/10.1007/978-3-642-31537-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31536-7
Online ISBN: 978-3-642-31537-4
eBook Packages: Computer Science (R0)