Disease Models, Part I: Graphical Models

Shpitser, Ilya

doi:10.1007/978-1-4419-0385-3_8

Ilya Shpitser PhD

Abstract

Scientists building models of the world by necessity abstract away features not directly relevant to their line of inquiry. Furthermore, complete knowledge of the relevant features is not generally possible. The mathematical formalism that has proven most successful at simultaneously abstracting away the irrelevant and summarizing incomplete knowledge is probability theory. First studied in the context of analyzing games of chance, probability theory has flowered into a mature mathematical discipline whose tools, methods, and concepts permeate statistics, engineering, and the social and empirical sciences. A key insight, discovered multiple times independently during the 20th century, but refined, generalized, and popularized by computer scientists, is that there is a close link between probabilities and graphs. This link allows quantitative relationships found in the study of probability, such as conditional independence, to be expressed in a visual, qualitative way using the language of graphs. As human intuitions are more readily brought to bear in visual settings than in algebraic and computational ones, graphs aid human comprehension in complex probabilistic domains. This connection between probabilities and graphs has other advantages as well; for instance, the magnitude of computational resources needed to reason about a particular probabilistic domain can be read from a graph representing this domain. Finally, graphs provide a concise and intuitive language for reasoning about causes and effects. In this chapter, we explore the basic laws of probability, the relationship between probability and causation, the way in which graphs can be used to reason about probabilistic and causal models, and finally how such graphical models can be learned from data. Applications of these graphs to formalizing observations and knowledge about disease are also provided.

Notes

1. We follow the standard notation of capitalizing random variables and of using lower case letters for outcomes. Bold characters symbolize sets or vectors of variables.
2. Bayesian networks are also referred to as Bayesian belief networks (BBNs), or belief networks. We use all three terms interchangeably throughout this book.
3. These and other types of Bayesian queries are covered in further detail in Chapter 9.
4. For causal models, described later in this chapter, this process is called inductive causal inference (also known as causal discovery), which is learning the causal graph from data.
5. Briefly, the EM algorithm consists of two parts: the E-step, wherein the missing data are estimated using the conditional expectation, based on the observed data and the current estimate of the model parameters; and the M-step, where the likelihood function is maximized assuming the missing data are known (the estimated data from the E-step being used in lieu of the actual missing data).
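
The two steps in note 5 can be illustrated with a minimal sketch. It assumes a hypothetical two-node network X → Y with binary variables, in which some values of X are missing; the function name `em_two_node`, the record format, and the starting values are illustrative assumptions and are not taken from the chapter.

```python
def em_two_node(records, n_iter=50):
    """Illustrative EM for a two-node network X -> Y with binary variables.

    records: list of (x, y) pairs; x is 0, 1, or None (missing); y is 0 or 1.
    Returns (p_x, p_y) where p_x = P(X=1) and p_y[v] = P(Y=1 | X=v).
    Assumes the data contain enough observed cases that no expected
    count becomes zero (otherwise the divisions below would fail).
    """
    p_x = 0.5          # arbitrary starting guess for P(X=1)
    p_y = [0.4, 0.6]   # arbitrary starting guesses for P(Y=1 | X=0), P(Y=1 | X=1)
    n = len(records)

    for _ in range(n_iter):
        # E-step: for each record, the expected value of X given the observed
        # data and the current parameters. Observed X is used as-is.
        weights = []
        for x, y in records:
            if x is None:
                like1 = p_x * (p_y[1] if y == 1 else 1 - p_y[1])
                like0 = (1 - p_x) * (p_y[0] if y == 1 else 1 - p_y[0])
                weights.append(like1 / (like1 + like0))  # P(X=1 | Y=y)
            else:
                weights.append(float(x))

        # M-step: maximum-likelihood estimates, treating the expected values
        # from the E-step as if they were observed (fractional) counts.
        n1 = sum(weights)   # expected number of records with X=1
        n0 = n - n1         # expected number of records with X=0
        p_x = n1 / n
        p_y = [
            sum((1 - w) * y for w, (_, y) in zip(weights, records)) / n0,
            sum(w * y for w, (_, y) in zip(weights, records)) / n1,
        ]

    return p_x, p_y


if __name__ == "__main__":
    data = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1), (None, 1), (None, 0)]
    print(em_two_node(data))
```

When no values are missing, the E-step leaves the data unchanged and the M-step reduces to ordinary frequency counting, so the sketch returns the usual maximum-likelihood estimates.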

Author information

Authors and Affiliations

School of Public Health, Harvard University, Harvard, USA
Ilya Shpitser PhD

Editor information

Editors and Affiliations

Medical Imaging Informatics Group, University of California, Los Angeles, Westwood Blvd. 924, Los Angeles, 90024, U.S.A.
Alex A.T. Bui
Medical Imaging Informatics Group, University of California, Los Angeles, Westwood Blvd. 924, Los Angeles, 90024, U.S.A.
Ricky K. Taira

About this chapter

Cite this chapter

Shpitser, I. (2010). Disease Models, Part I: Graphical Models. In: Bui, A., Taira, R. (eds) Medical Imaging Informatics. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-0385-3_8

DOI: https://doi.org/10.1007/978-1-4419-0385-3_8
Published: 10 October 2009
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-0384-6
Online ISBN: 978-1-4419-0385-3
eBook Packages: Engineering, Engineering (R0)
