On the Behaviour of Information Measures for Test Selection

  • Danielle Sent
  • Linda C. van der Gaag
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4594)

Abstract

In diagnostic decision-support systems, a test-selection facility serves to select tests that are expected to yield the largest decrease in the uncertainty about a patient’s diagnosis. For capturing diagnostic uncertainty, often an information measure is used. In this paper, we study the Shannon entropy, the Gini index, and the misclassification error for this purpose. We argue that for a large range of values, the first derivative of the Gini index can be regarded as an approximation of the first derivative of the Shannon entropy. We also argue that the differences between the derivative functions outside this range can explain different test sequences in practice. We further argue that the misclassification error is less suited for test-selection purposes as it is likely to show a tendency to select tests arbitrarily. Experimental results from using the measures with a real-life probabilistic network in oncology support our observations.
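The three measures named in the abstract, and the expected decrease in uncertainty a test yields, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the list-based representation of distributions, and the numbers in the usage note are all hypothetical, and the standard textbook definitions of the measures are assumed.

```python
import math

def shannon_entropy(p):
    """Shannon entropy (in bits) of a discrete distribution p."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def gini_index(p):
    """Gini index: 1 minus the sum of squared probabilities."""
    return 1.0 - sum(x * x for x in p)

def misclassification_error(p):
    """Misclassification error: 1 minus the largest probability."""
    return 1.0 - max(p)

def expected_decrease(measure, prior, likelihoods):
    """Expected decrease in `measure` over the diagnosis after one test.

    prior[d]          = P(diagnosis d)              (assumed representation)
    likelihoods[r][d] = P(test result r | diagnosis d)
    """
    before = measure(prior)
    after = 0.0
    for likelihood in likelihoods:
        joint = [l * p for l, p in zip(likelihood, prior)]
        p_result = sum(joint)
        if p_result > 0:
            posterior = [j / p_result for j in joint]
            after += p_result * measure(posterior)
    return before - after

def select_test(measure, prior, tests):
    """Myopically pick the test name with the largest expected decrease."""
    return max(tests, key=lambda name: expected_decrease(measure, prior, tests[name]))
```

As a hypothetical usage example, take `prior = [0.7, 0.3]` and a weak test with `likelihoods = [[0.8, 0.6], [0.2, 0.4]]`. Neither test result changes the most probable diagnosis, so the expected decrease in misclassification error is zero, whereas the Shannon entropy and the Gini index both register a small positive decrease. A measure that scores many tests as equally uninformative in this way has little basis for ranking them, which is one possible reading of the abstract's remark about arbitrary selection.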

Keywords

Shannon entropy, Gini index, misclassification error, test selection

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Danielle Sent (1)
  • Linda C. van der Gaag (2)
  1. Department of Electrical Engineering, Mathematics and Computer Science, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
  2. Department of Information and Computing Sciences, Utrecht University, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
