Soft Computing, Volume 16, Issue 5, pp 883–901

Mining fuzzy association rules from low-quality data

Focus

Abstract

Data mining is most commonly used to induce association rules from databases, which can help decision-makers analyze the data and make well-informed decisions about the domains concerned. Various studies have proposed methods for mining association rules from databases with crisp values. However, the data in many real-world applications have a certain degree of imprecision. In this paper we address this problem and propose a new data-mining algorithm for extracting interesting knowledge from databases with imprecise data. The proposed algorithm integrates imprecise data concepts with the fuzzy apriori mining algorithm to find interesting fuzzy association rules in given databases. Experiments on diagnosing dyslexia in early childhood were carried out to verify the performance of the proposed algorithm.
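To make the idea of fuzzy support over imprecise data more concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes that each imprecise value is given as an interval, that linguistic labels are triangular fuzzy sets, and that the minimum t-norm combines memberships; under those assumptions the support of an itemset becomes an interval rather than a single number. All names (`triangular`, `interval_membership`, `interval_support`) and the example data are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]
Triangle = Tuple[float, float, float]  # (a, b, c) with a < b < c


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership of a crisp value x in the triangular fuzzy set (a, b, c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def interval_membership(value: Interval, label: Triangle) -> Interval:
    """Bounds of the membership degree when the value is only known as an interval."""
    lo, hi = value
    a, b, c = label
    # A triangular set is unimodal, so its minimum over [lo, hi] is at an endpoint.
    lower = min(triangular(lo, a, b, c), triangular(hi, a, b, c))
    # The maximum is the peak if it lies inside the interval, else the larger endpoint.
    if lo <= b <= hi:
        upper = 1.0
    else:
        upper = max(triangular(lo, a, b, c), triangular(hi, a, b, c))
    return lower, upper


def interval_support(records: List[Dict[str, Interval]],
                     itemset: Dict[str, Triangle]) -> Interval:
    """Interval-valued fuzzy support of an itemset (assumed min t-norm)."""
    low_sum = 0.0
    up_sum = 0.0
    for rec in records:
        bounds = [interval_membership(rec[attr], label)
                  for attr, label in itemset.items()]
        # min is monotone, so combining lower (upper) bounds gives a lower (upper) bound.
        low_sum += min(lb for lb, _ in bounds)
        up_sum += min(ub for _, ub in bounds)
    n = len(records)
    return low_sum / n, up_sum / n


# Hypothetical data: two records, two attributes, one label per attribute.
itemset = {"x": (0.0, 5.0, 10.0), "y": (0.0, 5.0, 10.0)}
records = [
    {"x": (5.0, 5.0), "y": (2.5, 7.5)},  # x is precise, y is imprecise
    {"x": (0.0, 2.5), "y": (5.0, 5.0)},  # x is imprecise, y is precise
]
support = interval_support(records, itemset)
print(support)
```

In an apriori-style search, one possible design choice (again an assumption) is to compare the lower bound of this interval against the minimum support threshold for a cautious decision, or the upper bound for an optimistic one.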

Keywords

Data mining · Fuzzy association rules · Low-quality data

Notes

Acknowledgments

This study was supported by the Spanish Ministry of Education and Science under Grants no. TIN2008-06681-C06-{01 and 04}, TIN2011-28488 and by the Principado de Asturias under Grant PCTI 2006–2009.

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • A. M. Palacios (1)
  • M. J. Gacto (2)
  • J. Alcalá-Fdez (3)

  1. Department of Computer Science, University of Oviedo, Gijón, Spain
  2. Department of Computer Science, University of Jaén, Jaén, Spain
  3. Department of Computer Science and Artificial Intelligence, CITIC-UGR, University of Granada, Granada, Spain