Quality Measures and Semi-automatic Mining of Diagnostic Rule Bases

  • Martin Atzmueller
  • Joachim Baumeister
  • Frank Puppe
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3392)

Abstract

Semi-automatic data mining approaches often yield better results than plain automatic methods, due to the early integration of the user’s goals. For example, in the medical domain, experts are likely to favor simpler models over more complex ones. The accuracy of discovered patterns is then often not the only criterion to consider. Instead, the simplicity of the discovered knowledge is of prime importance, since it directly relates to the understandability and interpretability of the learned knowledge.

In this paper, we present quality measures that consider the understandability and the accuracy of (learned) rule bases. We describe a unifying quality measure, which can trade off small losses in accuracy against increased simplicity. Furthermore, we introduce a semi-automatic data mining method for learning understandable and accurate rule bases. The presented work is evaluated using cases from a real-world application in the medical domain.


Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Martin Atzmueller (1)
  • Joachim Baumeister (1)
  • Frank Puppe (1)
  1. Department of Computer Science, University of Würzburg, Würzburg, Germany