Improving the Accuracy of the Sequential Patterns-Based Classifiers

  • José K. Febrer-Hernández
  • Raudel Hernández-León
  • José Hernández-Palancar
  • Claudia Feregrino-Uribe
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9423)

Abstract

In this paper, we propose several improvements to Sequential Patterns-based Classifiers. First, we introduce a new pruning strategy that uses Netconf as the measure of interest and prunes the rule search space so that specific rules with high Netconf are built. Additionally, we propose a new way of ordering the set of rules based on their sizes and Netconf values. Together with the “Best K rules” satisfaction mechanism, this ordering strategy yields better accuracy than the SVM, J48, NaiveBayes and PART classifiers over three document collections.
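
The ideas named in the abstract can be illustrated with a minimal sketch. The Netconf formula below follows the commonly used definition, Netconf(X ⇒ Y) = (Sup(X ⇒ Y) − Sup(X)·Sup(Y)) / (Sup(X)·(1 − Sup(X))). Everything else is an assumption made for illustration, since the abstract does not give the details: the names `Rule`, `order_rules` and `classify_best_k` are hypothetical, rules are assumed to be ordered by decreasing size with ties broken by decreasing Netconf, and “Best K rules” is assumed to average the Netconf of the first K covering rules per class. It is not the authors' implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


def netconf(sup_xy: float, sup_x: float, sup_y: float) -> float:
    """Netconf(X => Y) = (sup(X=>Y) - sup(X)*sup(Y)) / (sup(X)*(1 - sup(X)))."""
    denom = sup_x * (1.0 - sup_x)
    if denom == 0.0:
        raise ValueError("Netconf is undefined when sup(X) is 0 or 1")
    return (sup_xy - sup_x * sup_y) / denom


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule container: a sequential pattern implying a class."""
    antecedent: tuple  # ordered items of the sequential pattern
    label: str         # predicted class
    netconf: float     # Netconf value of the rule


def is_subsequence(pattern, sequence) -> bool:
    """True if the items of `pattern` occur in `sequence` in the same order."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def order_rules(rules):
    """Assumed ordering: larger rules first, ties broken by higher Netconf."""
    return sorted(rules, key=lambda r: (-len(r.antecedent), -r.netconf))


def classify_best_k(rules, sequence, k=5):
    """Assumed 'Best K rules': per class, average the Netconf of the first k
    covering rules (in rule order) and return the class with the best average."""
    if k < 1:
        raise ValueError("k must be at least 1")
    per_class = defaultdict(list)
    for rule in order_rules(rules):
        if not is_subsequence(rule.antecedent, sequence):
            continue  # the rule does not cover this sequence
        bucket = per_class[rule.label]
        if len(bucket) < k:
            bucket.append(rule.netconf)
    if not per_class:
        return None  # no rule covers the sequence
    return max(per_class, key=lambda c: sum(per_class[c]) / len(per_class[c]))
```

In this sketch a document would be passed as its ordered list of terms, and `k` trades off reliance on the single best rule against agreement among several covering rules.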

Keywords

Data mining · Supervised classification · Sequential patterns

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • José K. Febrer-Hernández (1)
  • Raudel Hernández-León (1)
  • José Hernández-Palancar (1)
  • Claudia Feregrino-Uribe (2)
  1. Centro de Aplicaciones de Tecnologías de Avanzada (CENATAV), Playa, Cuba
  2. Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Puebla, Mexico
