RP-Miner: a relaxed prune algorithm for frequent similar pattern mining

Rodríguez-González, Ansel Yoan; Martínez-Trinidad, José Francisco; Carrasco-Ochoa, Jesús Ariel; Ruiz-Shulcloper, José

doi:10.1007/s10115-010-0309-9

Regular Paper
Published: 16 June 2010

Volume 27, pages 451–471, (2011)

Knowledge and Information Systems

Ansel Yoan Rodríguez-González¹,², José Francisco Martínez-Trinidad², Jesús Ariel Carrasco-Ochoa² & José Ruiz-Shulcloper³

Abstract

Most current algorithms for mining frequent patterns assume that two object subdescriptions are similar if they are equal, but many real-world problems evaluate similarity in other ways. Recently, three algorithms (ObjectMiner, STreeDC-Miner and STreeNDC-Miner) have been proposed for mining frequent patterns with similarity functions other than equality. To search for frequent patterns, ObjectMiner and STreeDC-Miner rely on a pruning property called the Downward Closure property, which the similarity function must satisfy. STreeNDC-Miner was proposed for similarity functions that do not meet this property; however, it explores all subsets of features when searching for frequent patterns, which can be very expensive. In this work, we propose a frequent similar pattern mining algorithm for similarity functions that do not meet the Downward Closure property, which is faster than STreeNDC-Miner and loses fewer frequent similar patterns than ObjectMiner and STreeDC-Miner. We also compare, in a supervised classification context, the quality of the set of frequent similar patterns computed by our algorithm with the quality of the sets computed by the other algorithms.
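The abstract does not define the Downward Closure property formally, so the following is only a toy illustration under stated assumptions, not the RP-Miner algorithm or any of the cited algorithms. It assumes that the frequency of a pattern over a feature subset is the fraction of objects similar to it, and that Downward Closure means every sub-pattern of a frequent pattern is also frequent. The data set, the function names and the `majority_match` similarity are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data set: each object maps feature names to values.
OBJECTS = [
    {"a": 1, "b": 5, "c": 9},
    {"a": 2, "b": 5, "c": 9},
    {"a": 3, "b": 5, "c": 9},
    {"a": 4, "b": 5, "c": 9},
]


def equal(x, y, features):
    """Classical similarity: subdescriptions are similar only if equal."""
    return all(x[f] == y[f] for f in features)


def majority_match(x, y, features):
    """Assumed similarity: at least half of the features must match."""
    matches = sum(x[f] == y[f] for f in features)
    return 2 * matches >= len(features)


def frequency(pattern, features, similar):
    """Fraction of objects whose subdescription is similar to the pattern."""
    hits = sum(similar(pattern, obj, features) for obj in OBJECTS)
    return hits / len(OBJECTS)


def violates_downward_closure(similar, min_freq):
    """Return (features, sub_features) for the first frequent pattern that
    has an infrequent sub-pattern, or None if no such pattern exists."""
    all_features = sorted(OBJECTS[0])
    for size in range(2, len(all_features) + 1):
        for features in combinations(all_features, size):
            for pattern in OBJECTS:
                if frequency(pattern, features, similar) < min_freq:
                    continue
                for sub in combinations(features, size - 1):
                    if frequency(pattern, sub, similar) < min_freq:
                        return features, sub
    return None


print(violates_downward_closure(equal, 0.5))
print(violates_downward_closure(majority_match, 0.5))
```

With equality, no frequent pattern in this toy data has an infrequent sub-pattern. With the assumed `majority_match` function, the pattern over features `a` and `b` is frequent while its restriction to `a` alone is not, so a search that prunes at the single feature `a` would never reach it. This is the kind of loss the abstract attributes to applying Downward Closure pruning to similarity functions that do not satisfy the property.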

Author information

Authors and Affiliations

Data Mining Department, Advanced Technologies Application Center, Siboney, Havana, Cuba
Ansel Yoan Rodríguez-González
Department of Computer Science, National Institute of Astrophysics, Optics and Electronics, Tonantzintla, Puebla, Mexico
Ansel Yoan Rodríguez-González, José Francisco Martínez-Trinidad & Jesús Ariel Carrasco-Ochoa
Advanced Technologies Application Center, Siboney, Havana, Cuba
José Ruiz-Shulcloper

Corresponding author

Correspondence to Ansel Yoan Rodríguez-González.

About this article

Cite this article

Rodríguez-González, A.Y., Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A. et al. RP-Miner: a relaxed prune algorithm for frequent similar pattern mining. Knowl Inf Syst 27, 451–471 (2011). https://doi.org/10.1007/s10115-010-0309-9

Received: 14 August 2009
Revised: 29 April 2010
Accepted: 22 May 2010
Published: 16 June 2010
Issue Date: June 2011
DOI: https://doi.org/10.1007/s10115-010-0309-9
