PKDD 1999: Principles of Data Mining and Knowledge Discovery pp 323-328
Combining Data and Knowledge by MaxEnt-Optimization of Probability Distributions
Abstract
We present a project for probabilistic reasoning based on the concept of maximum entropy and the induction of probabilistic knowledge from data. The basic knowledge source is a database of 15000 patient records which we use to compute probabilistic rules. These rules are combined with explicit probabilistic rules from medical experts which cover cases not represented in the database. Based on this set of rules the inference engine PIT (Probability Induction Tool), which uses the well-known principle of Maximum Entropy [5], provides a unique probability model while keeping the necessary additional assumptions as minimal and clear as possible. PIT is used in the medical diagnosis project Lexmed [4] for the identification of acute appendicitis. Based on the probability distribution computed by PIT, the expert system proposes treatments with minimal average cost. First clinical performance results are very encouraging.
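The pipeline described above (probabilistic rules, a maximum-entropy model, then a cost-minimal treatment) can be illustrated with a small sketch. This is not PIT's algorithm, which the abstract does not describe. It substitutes plain iterative proportional fitting, which, started from the uniform distribution, converges to the maximum-entropy distribution for consistent constraints of the form P(event) = target. The function names, the two variables, the rule probabilities and the cost table are all hypothetical values chosen for illustration.

```python
from itertools import product


def maxent_ipf(n_vars, constraints, sweeps=200):
    """Approximate the MaxEnt distribution over n_vars binary variables.

    constraints: list of (event, target) pairs, where event(world) -> bool and
    0 < target < 1 is the required probability of that event.
    Stand-in technique (assumption): iterative proportional fitting from uniform.
    """
    worlds = list(product((0, 1), repeat=n_vars))
    p = {w: 1.0 / len(worlds) for w in worlds}  # uniform = maximum entropy with no rules
    for _ in range(sweeps):
        for event, target in constraints:
            mass = sum(q for w, q in p.items() if event(w))
            for w in worlds:
                # Rescale mass inside and outside the event to match the target.
                if event(w):
                    p[w] *= target / mass
                else:
                    p[w] *= (1.0 - target) / (1.0 - mass)
    return p


def conditional(p, event, given):
    """P(event | given) under the joint distribution p."""
    joint = sum(q for w, q in p.items() if event(w) and given(w))
    return joint / sum(q for w, q in p.items() if given(w))


# Worlds are pairs (a, s): a = appendicitis, s = symptom present (hypothetical).
rules = [
    (lambda w: w[0] == 1, 0.30),                # P(a) = 0.30, invented value
    (lambda w: w[0] == 1 and w[1] == 1, 0.24),  # P(a and s) = 0.24, i.e. P(s | a) = 0.8
]
model = maxent_ipf(2, rules)

# Posterior for a patient who shows the symptom.
p_app = conditional(model, lambda w: w[0] == 1, lambda w: w[1] == 1)

# Hypothetical cost table: cost[treatment] = (cost if no appendicitis, cost if appendicitis).
cost = {"operate": (1.0, 0.0), "observe": (0.0, 5.0)}
expected = {t: (1.0 - p_app) * c[0] + p_app * c[1] for t, c in cost.items()}
best = min(expected, key=expected.get)
print(f"P(appendicitis | symptom) = {p_app:.3f}, proposed treatment: {best}")
```

The rules say nothing about how often the symptom occurs without appendicitis, so the maximum-entropy model splits that probability mass evenly. This is the sense in which additional assumptions are kept minimal.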
Keywords
Maximum Entropy, Acute Appendicitis, Average Cost, Multivariate Statistics, Inference Engine
References
- 1. De Dombal: Diagnosis of Acute Abdominal Pain. Churchill Livingstone (1991)
- 2. Hontschik, B.: Theorie und Praxis der Appendektomie. Mabuse Verlag (1994)
- 3. Jaynes, E.T.: Concentration of distributions at entropy maxima. In: Rosenkrantz (ed.) Papers on Probability, Statistics and Statistical Physics. D. Reidel Publishing Company (1982)
- 4. Homepage of LEXMED (1999), http://lexmed.fh-weingarten.de
- 5. Paris, J.B., Vencovska, A.: A Note on the Inevitability of Maximum Entropy. International Journal of Approximate Reasoning 3, 183–223 (1990)
- 6. Pearl, J.: Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Francisco (1988)
- 7. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA (1993); C5.0 available online at http://www.rulequest.com
- 8. Schramm, M., Ertel, W.: Reasoning with Probabilities and Maximum Entropy: The System PIT and its Application in LEXMED. Accepted at Symposium on Operations Research 1999 (1999)
- 9. Whittaker, J.: Graphical Models in Applied Multivariate Statistics. John Wiley, Chichester (1990)