Networks Between Categorical or Discretized Numeric Variables

Chapter in: Weighted Network Analysis

Abstract

Categorical variables take on non-numeric values; a discretized numeric variable, for example, can be interpreted as a categorical variable. Many association measures exist for quantifying the statistical dependence between categorical variables (e.g., the Pearson chi-square statistic, the likelihood ratio test statistic, Fisher’s exact test, and mutual information). Association measures between two discretized numeric vectors have also been used to measure nonlinear dependencies between the underlying numeric variables. We describe several approaches for defining weighted networks among categorical variables. In particular, mutual information networks are often used for constructing gene networks. The close relationship between mutual information and a likelihood ratio test statistic allows us to define a conditional measure of mutual information, which accounts for additional covariates. Estimating the mutual information between numeric variables is challenging and involves parameter choices. We argue that in many applications the mutual information measure can be approximated by a correlation-based measure. We review the ARACNE approach for constructing an unweighted mutual information network and generalize it to correlation networks and other association networks.
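
As a rough illustration of two ideas mentioned in the abstract, the following Python sketch computes the plug-in (empirical) mutual information between two categorical vectors, relates it to the likelihood ratio statistic through the standard identity G = 2·N·MI (with MI in nats), and applies an ARACNE-style pruning step based on the data processing inequality. It is a minimal sketch under stated assumptions, not the chapter's implementation: the function names `mutual_information`, `g_statistic`, and `aracne_prune` are hypothetical, the additive tolerance `eps` is an illustrative choice, and the initial significance threshold on mutual information used by ARACNE is simplified to "greater than zero".

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in (empirical) mutual information, in nats, of two categorical vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), n_ab in count_xy.items():
        # p(a,b) * log(p(a,b) / (p(a) * p(b))), written in terms of raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_x[a] * count_y[b]))
    return mi


def g_statistic(x, y):
    """Likelihood ratio (G) statistic for independence: G = 2 * N * MI."""
    return 2.0 * len(x) * mutual_information(x, y)


def aracne_prune(mi, eps=0.0):
    """ARACNE-style pruning of a symmetric MI matrix (list of lists).

    An edge (i, j) is dropped if some third node k has both MI(i, k) and
    MI(j, k) larger than MI(i, j) by more than eps (assumed additive tolerance).
    Every triplet is judged on the original MI values.
    Returns an unweighted 0/1 adjacency matrix.
    """
    p = len(mi)
    # Simplification: keep any pair with positive MI before pruning.
    keep = [[1 if i != j and mi[i][j] > 0 else 0 for j in range(p)] for i in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            for k in range(p):
                if k in (i, j):
                    continue
                if mi[i][j] < min(mi[i][k], mi[j][k]) - eps:
                    keep[i][j] = keep[j][i] = 0
                    break
    return keep


if __name__ == "__main__":
    x = ["a", "a", "b", "b"]
    y = ["u", "u", "v", "v"]   # perfectly dependent on x
    z = ["u", "v", "u", "v"]   # empirically independent of x
    print(mutual_information(x, y), g_statistic(x, y))
    print(mutual_information(x, z))
    print(aracne_prune([[0.0, 0.9, 0.3], [0.9, 0.0, 0.8], [0.3, 0.8, 0.0]]))
```

The plug-in estimator is biased upward in small samples, so this sketch is only meant to make the definitions concrete. In the toy matrix, the weakest edge of the single triplet (between the first and third nodes) is the one removed.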

References

  • Agresti A (2007) An introduction to categorical data analysis (Wiley series in probability and statistics), 2nd edn. Wiley, New York

  • Beirlant J, Dudewicz EJ, Györfi L, van der Meulen EC (1997) Nonparametric entropy estimation: An overview. Int J Math Stat Sci 6(1):17–39

  • Butte AJ, Kohane IS (2000) Mutual information relevance networks: Functional genomic clustering using pairwise entropy measurements. Pac Symp Biocomput 5:418–429

  • Butte A, Tamayo P, Slonim D, Golub T, Kohane I (2000) Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. Proc Natl Acad Sci USA 97:12182–12186

  • Cheng J, Greiner R, Kelly J, Bell D, Liu W (2002) Learning Bayesian networks from data: An information-theory based approach. Artif Intell 137(1–2):43–90

  • Chow C, Liu C (1968) Approximating discrete probability distributions with dependence trees. IEEE Trans Inf Theory 14:462–467

  • Cover TM, Thomas JA (1991) Elements of information theory. Wiley, New York

  • Darbellay G, Vajda I (1999) Estimation of the information by an adaptive partitioning of the observation space. IEEE Trans Inf Theory 45:1315–1321

  • Daub CO, Steuer R, Selbig J, Kloska S (2004) Estimating mutual information using B-spline functions - an improved similarity measure for analysing gene expression data. BMC Bioinform 5(1):118

  • Fraser AM, Swinney HL (1986) Independent coordinates for strange attractors from mutual information. Phys Rev A 33(2):1134–1140

  • Hausser J, Strimmer K (2009) Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks. J Mach Learn Res 10:1469–1484

  • Kraskov A, Stögbauer H, Andrzejak R, Grassberger P (2003) Hierarchical clustering based on mutual information. CoRR q-bio.QM/0311037

  • Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Favera RD, Califano A (2006) ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinform 7(Suppl. 1):S7

  • Mason M, Fan G, Plath K, Zhou Q, Horvath S (2009) Signed weighted gene co-expression network analysis of transcriptional regulation in murine embryonic stem cells. BMC Genomics 10(1):327

  • Meyer P, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinform 9(1):461

  • Nemenman I (2004) Information theory, multivariate dependence, and genetic network inference. Technical Report. NSF-KITP-04-54, KITP, UCSB. arXiv: q-bio/0406015

  • Paninski L (2003) Estimation of entropy and mutual information. Neural Comput 15(6):1191–1253

  • Shannon CE (1948) A mathematical theory of communication. CSLI Publications, Stanford, CA

  • Steuer R, Kurths J, Daub CO, Weise J, Selbig J (2002) The mutual information: Detecting and evaluating dependencies between variables. Bioinformatics 18(Suppl 2):S231–S240

  • Wiggins C, Nemenman I (2003) Process pathway inference via time series analysis. Exp Mech 43(3):361–370

Author information

Corresponding author

Correspondence to Steve Horvath.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Horvath, S. (2011). Networks Between Categorical or Discretized Numeric Variables. In: Weighted Network Analysis. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-8819-5_14
