Finding Rare Patterns with Weak Correlation Constraint: Progress in Indicative and Chance Patterns

  • Chapter
Advances in Chance Discovery

Part of the book series: Studies in Computational Intelligence ((SCI,volume 423))

Abstract

The notion of rare patterns has recently attracted attention in several research fields, including Chance Discovery, Formal Concept Analysis, and Data Mining. In this paper, we give an overview of the progress of our investigations into rare patterns satisfying a weak-correlation constraint. A rare pattern must carry some significance in addition to having only a few instances. We focus on patterns that are itemsets in a transaction database: each consists of several general items, yet has a very small degree of correlation despite the generality of its component items. Such a pattern is called an indicative pattern and is regarded as a rare pattern to be extracted.

To exclude trivial patterns of general items with few instances, we introduce an objective function that takes into account both the generality of the component items and the number of instances as objective evidence. We then try to find the indicative patterns with the Top-N evaluation values under the constraint that the degree of correlation must not exceed a given upper bound.
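The Top-N search under a correlation upper bound can be illustrated with a minimal brute-force sketch. The abstract does not specify the correlation measure, the objective function, or the search algorithm, so everything concrete here is an assumption made for illustration: correlation is measured by the bond of an itemset (transactions containing all items divided by transactions containing any item), an item counts as "general" when its relative frequency reaches a hypothetical threshold `min_item_freq`, and the hypothetical score multiplies the smallest item frequency by the number of instances. The names `bond`, `top_n_indicative`, and `SAMPLE_DB` are likewise invented for this example.

```python
from itertools import combinations


def cover(db, item):
    """Return the indices of the transactions in `db` that contain `item`."""
    return {i for i, t in enumerate(db) if item in t}


def bond(db, pattern):
    """Assumed correlation measure: |transactions with all items| / |transactions with any item|."""
    covers = [cover(db, x) for x in pattern]
    union = set.union(*covers)
    return len(set.intersection(*covers)) / len(union) if union else 0.0


def top_n_indicative(db, n, max_corr, min_item_freq, max_size=3):
    """Enumerate itemsets of general items and keep the n best-scoring weakly correlated ones."""
    items = sorted({x for t in db for x in t})
    # Assumption: an item is "general" if it occurs in at least min_item_freq of all transactions.
    general = [x for x in items if len(cover(db, x)) / len(db) >= min_item_freq]
    scored = []
    for k in range(2, max_size + 1):
        for pattern in combinations(general, k):
            covers = [cover(db, x) for x in pattern]
            support = len(set.intersection(*covers))
            if support == 0:
                continue  # a pattern needs at least one instance
            if bond(db, pattern) > max_corr:
                continue  # weak-correlation constraint: correlation must not exceed the bound
            # Hypothetical objective: generality of the least general item times the instance count.
            score = min(len(c) for c in covers) * support
            scored.append((score, pattern, support))
    # Highest score first; ties broken by the pattern itself for a deterministic order.
    scored.sort(key=lambda r: (-r[0], r[1]))
    return scored[:n]


# A toy transaction database, invented for this example.
SAMPLE_DB = [
    {"a", "b"}, {"a"}, {"a"}, {"b"},
    {"b"}, {"a", "c"}, {"c", "b"}, {"c"},
]

if __name__ == "__main__":
    for score, pattern, support in top_n_indicative(SAMPLE_DB, n=2, max_corr=0.2, min_item_freq=0.3):
        print(pattern, "score:", score, "instances:", support)
```

Exhaustive enumeration is only practical for small item sets; a real implementation would need pruning, which this sketch does not attempt.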

To make a hidden relationship between a pair of more frequent patterns visible, the framework for finding Top-N indicative patterns is then extended by imposing structural constraints on the indicative pattern and the larger patterns bridged by it. As recent progress in this direction, we briefly present a framework for finding chance patterns with KeyGraph®-based importance, together with some experimental results.

Author information

Corresponding author

Correspondence to Yoshiaki Okubo.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Okubo, Y., Haraguchi, M., Nakajima, T. (2013). Finding Rare Patterns with Weak Correlation Constraint: Progress in Indicative and Chance Patterns. In: Ohsawa, Y., Abe, A. (eds) Advances in Chance Discovery. Studies in Computational Intelligence, vol 423. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30114-8_15

  • DOI: https://doi.org/10.1007/978-3-642-30114-8_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30113-1

  • Online ISBN: 978-3-642-30114-8

  • eBook Packages: Engineering, Engineering (R0)
