A Mixed Learning Strategy for Finding Typical Testors in Large Datasets

  • Víctor Iván González-Guevara
  • Salvador Godoy-Calderon
  • Eduardo Alba-Cabrera (email author)
  • Julio Ibarra-Fiallo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9423)


This paper presents a mixed (global and local) learning strategy for finding typical testors in large datasets. The strategy aims to let any search algorithm reduce the search space of a typical testor-finding problem as much as possible. It relies on a trivial classifier that partitions the search space into four distinct classes and allows each feature subset within it to be assessed. Each class is handled by slightly different learning actions and induces a different reduction of the search space. Any typical testor-finding algorithm, whether deterministic or metaheuristic, can be adapted to incorporate the proposed strategy and can exploit the learned information in diverse ways.
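For readers new to testor theory, the objects being searched for can be illustrated with a minimal sketch. It assumes the standard definitions: given a Boolean basic matrix, a feature subset is a testor if every row has a 1 in at least one selected column, and a typical testor is a testor from which no feature can be removed. The abstract does not spell out the four classes of the proposed classifier, so they are not modelled here; the function names and the toy matrix are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_testor(basic_matrix, features):
    """True if every row has a 1 in at least one of the selected columns."""
    return all(any(row[j] for j in features) for row in basic_matrix)


def is_typical_testor(basic_matrix, features):
    """True if the subset is a testor and removing any single feature breaks it.

    Checking single removals is enough because every superset of a testor
    is also a testor.
    """
    if not is_testor(basic_matrix, features):
        return False
    return all(
        not is_testor(basic_matrix, [g for g in features if g != f])
        for f in features
    )


def typical_testors_brute_force(basic_matrix):
    """Enumerate all feature subsets; exponential, shown only as a baseline."""
    n = len(basic_matrix[0])
    return [
        set(subset)
        for size in range(1, n + 1)
        for subset in combinations(range(n), size)
        if is_typical_testor(basic_matrix, subset)
    ]


# Hypothetical toy basic matrix (rows are comparisons, columns are features).
BASIC_MATRIX = [
    [1, 0, 0],
    [0, 1, 1],
]

print(typical_testors_brute_force(BASIC_MATRIX))
```

The brute-force enumeration visits all 2^n - 1 non-empty subsets, which is the exponential search space that a strategy such as the one described above is meant to prune.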


Feature selection, Testor theory, Algorithms



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Víctor Iván González-Guevara (1)
  • Salvador Godoy-Calderon (1)
  • Eduardo Alba-Cabrera (2), email author
  • Julio Ibarra-Fiallo (2)
  1. Instituto Politécnico Nacional, Centro de Investigación en Computación (CIC), Ciudad de Mexico, Mexico
  2. Colegio de Ciencias e Ingenierías, Departamento de Matemáticas, Universidad San Francisco de Quito (USFQ), Quito, Ecuador
