Skip to main content

Algorithms for Redescription Mining

  • Chapter
  • First Online:
Redescription Mining

Part of the book series: SpringerBriefs in Computer Science ((BRIEFSCOMPUTER))

  • 367 Accesses

Abstract

The aim of redescription mining is to find valid redescriptions for given data, query language, similarity relation, and user-specified constraints. In other words, we need to explore the search space consisting of query pairs from the query language, looking for those pairs that have similar enough support in the data and that satisfy the other constraints. In this chapter, we present the different methods that have been proposed to carry out this exploration efficiently. Existing methods can be arranged into three main categories: (1) mine-and-pair approaches, (2) alternating approaches, and (3) approaches that use atomic updates. We consider each one in turn, explaining its general common principles and looking at different algorithms designed on these principles. Next, we compare the different methods and discuss their relative strengths and weaknesses. Finally, we consider how to adapt the algorithms to handle cases where some values are missing from the input data.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    The equation for the pessimistic Jaccard index presented by Galbrun and Miettinen (2012, Equation 5.7) is erroneous, as it misses two summands from the denominator.

References

  • Aggarwal CC (2015) Data Mining: The Textbook. Springer, Cham, https://doi.org/10.1007/978-3-319-14142-8

  • Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of 20th International Conference on Very Large Data Bases (VLDB’94), pp 487–499

    Google Scholar 

  • Blockeel H, De Raedt L, Ramon J (1998) Top-down induction of clustering trees. In: Proceedings of the 15th International Conference on Machine Learning (ICML’98), pp 55–63

    Google Scholar 

  • Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. CRC press, Boca Raton, FL

    Google Scholar 

  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297, https://doi.org/10.1007/BF00994018

  • Galbrun E, Miettinen P (2012) From black and white to full color: Extending redescription mining outside the Boolean world. Stat Anal Data Min 5(4):284–303, https://doi.org/10.1002/sam.11145

  • Gallo A, Miettinen P, Mannila H (2008) Finding subgroups having several descriptions: Algorithms for redescription mining. In: Proceedings of the 8th SIAM International Conference on Data Mining (SDM’08), pp 334–345, https://doi.org/10.1137/1.9781611972788.30

  • Ganter B, Wille R (1999) Formal Concept Analysis: Mathematical Foundations. Springer, Berlin, https://doi.org/10.1007/978-3-642-59830-2

  • Garey MR, Johnson DS (2002) Computers and intractability. A guide to the theory of NP-completeness, vol 29. W. H. Freeman and Co., San Francisco, CA

    Google Scholar 

  • Kumar D (2007) Redescription mining: Algorithms and applications in bioinformatics. PhD thesis, Department of Computer Science, Virginia Polytechnic Institute and State University

    Google Scholar 

  • Mannila H, Toivonen H, Verkamo AI (1994) Efficient algorithms for discovering association rules. In: Proceedings of the 1994 AAAI Workshop on Knowledge Discovery in Databases (KDD’94), pp 181–192

    Google Scholar 

  • Mihelčić M, Džeroski S, Lavrač N, Šmuc T (2017) A framework for redescription set construction. Expert Syst Appl 68:196–215, https://doi.org/10.1016/j.eswa.2016.10.012

  • Mihelčić M, Džeroski S, Lavrač N, Šmuc T (2016) Redescription mining with multi-target predictive clustering trees. In: Proceedings of the 4th International Workshop on the New Frontiers in Mining Complex Patterns (NFMCP’15), pp 125–143, https://doi.org/10.1007/978-3-319-39315-5_9

  • Mihelčić M, Džeroski S, Lavrač N, Šmuc T (2017) Redescription mining augmented with random forest of multi-target predictive clustering trees. J of Intell Inf Syst pp 1–34, https://doi.org/10.1007/s10844-017-0448-5

  • Négrevergne B, Termier A, Rousset M, Méhaut J (2014) Para miner: A generic pattern mining algorithm for multi-core architectures. Data Min Knowl Disc 28(3):593–633, https://doi.org/10.1007/s10618-013-0313-2

  • Quinlan J (1986) Induction of decision trees. Mach Learn 1(1):81–106, https://doi.org/10.1023/A:1022643204877

  • Ramakrishnan N, Zaki MJ (2009) Redescription mining and applications in bioinformatics. In: Chen J, Lonardi S (eds) Biological Data Mining, Chapman and Hall/CRC, Boca Raton, FL

    Google Scholar 

  • Ramakrishnan N, Kumar D, Mishra B, Potts M, Helm RF (2004) Turning CARTwheels: An alternating algorithm for mining redescriptions. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’04), pp 266–275, https://doi.org/10.1145/1014052.1014083

  • Zaki MJ, Hsiao CJ (2005) Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Trans Knowl Data En 17(4):462–478, https://doi.org/10.1109/TKDE.2005.60

  • Zaki MJ, Ramakrishnan N (2005) Reasoning about sets using redescription mining. In: Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’05), pp 364–373, https://doi.org/10.1145/1081870.1081912

  • Zhao L, Zaki MJ, Ramakrishnan N (2006) BLOSOM: A framework for mining arbitrary Boolean expressions. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’06), pp 827–832, https://doi.org/10.1145/1150402.1150511

  • Zinchenko T, Galbrun E, Miettinen P (2015) Mining predictive redescriptions with trees. In: IEEE International Conference on Data Mining Workshops, pp 1672–1675, https://doi.org/10.1109/ICDMW.2015.123

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

Copyright information

© 2017 The Author(s)

About this chapter

Check for updates. Verify currency and authenticity via CrossMark

Cite this chapter

Galbrun, E., Miettinen, P. (2017). Algorithms for Redescription Mining. In: Redescription Mining. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-72889-6_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-72889-6_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-72888-9

  • Online ISBN: 978-3-319-72889-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics