A Performance Comparative Analysis Between Rule-Induction Algorithms and Clustering-Based Constructive Rule-Induction Algorithms. Application to Rheumatoid Arthritis

  • J. A. Sanandrés-Ledesma
  • Victor Maojo
  • Jose Crespo
  • M. García-Remesal
  • A. Gómez de la Cámara
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3337)


We present a comparative performance analysis of traditional rule-induction algorithms and clustering-based constructive rule-induction algorithms. The main idea behind the constructive methods is to find dependency relations among primitive variables and use them to generate new features. These dependencies, which correspond to regions in the space of variables, can be represented as clusters of examples, so unsupervised clustering methods are proposed to search for them. A database of rheumatoid arthritis (RA) patients was used as a benchmark. A set of clinical prediction rules for prognosis in RA was obtained by applying the most successful methods, selected according to the study outcomes. We suggest that it is possible to relate predictive features and long-term outcomes in RA.
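The idea of turning clusters of examples into new features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a clustering algorithm, so plain k-means is used here as a stand-in, and the function names `kmeans` and `add_cluster_feature` as well as the toy data are assumptions made for illustration only.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means (a stand-in, not the paper's method).

    Returns (centroids, labels), where labels[i] is the cluster index
    assigned to points[i].
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct examples.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each example goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def add_cluster_feature(points, labels):
    """Append the cluster index to each example as a constructed feature."""
    return [list(p) + [lab] for p, lab in zip(points, labels)]


# Hypothetical toy data: two primitive variables per example.
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
_, toy_labels = kmeans(toy, k=2)
for row in add_cluster_feature(toy, toy_labels):
    print(row)
```

The augmented rows, primitive variables plus the constructed cluster feature, would then be handed to whichever rule-induction algorithm is being compared, so that rules can test cluster membership directly.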


Keywords: Rheumatoid Arthritis; Clinical Prediction Rule; Inductive Learning; Primitive Variable; Cluster Validity Index





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • J. A. Sanandrés-Ledesma (1)
  • Victor Maojo (1)
  • Jose Crespo (1)
  • M. García-Remesal (1)
  • A. Gómez de la Cámara (2)
  1. Medical Informatics Group, AI Lab, School of Computer Science, Polytechnical University of Madrid, Spain
  2. Clinical Epidemiology Research Unit, Hospital 12 de Octubre, Madrid, Spain
