Shannon Information as a Measure of Amino Acid Diversity

  • Conference paper
Theoretical and Experimental Insights into Immunology

Part of the book series: NATO ASI Series ((ASIH,volume 66))

Abstract

The Wu-Kabat (KW) measure of diversity is intended to relate antigen-binding structure to residue position. Jores et al. (1) (JAM) generalized KW to resolve hypervariable regions in T-cell receptor β chains. This set the stage for further improvements in diversity measurement, and the use of Shannon information improves it in several important ways. The KW and JAM measures share several drawbacks:

  1. The sampling variability of these measures is not available, so statistical comparison of diversity among residue sites is not done.

  2. Both KW and JAM are unstable in the sense that a single new observation can cause substantial jumps in their values.

  3. Neither KW nor JAM can represent the scope of amino acid diversity since they do not account for the proportions of all twenty amino acids.

The Shannon measure (H) addresses drawback 1, greatly reduces drawback 2, and is not subject to drawback 3. We obtain the variance of H along with that of KW and JAM, and show that the coefficient of variation of H is much smaller than that of the other measures.
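As a rough illustration of the instability drawback, the following Python sketch compares how one additional observation moves KW and H. It assumes the commonly quoted form of the Wu-Kabat index (number of distinct residues at a position divided by the relative frequency of the most common residue), which the abstract does not spell out. The alignment column and the function names are hypothetical, and JAM is omitted.

```python
import math
from collections import Counter

def wu_kabat(residues):
    """Assumed Wu-Kabat form: distinct residues / relative frequency of the most common one."""
    counts = Counter(residues)
    top_fraction = counts.most_common(1)[0][1] / len(residues)
    return len(counts) / top_fraction

def shannon(residues):
    """Plug-in Shannon information (bits) from observed residue proportions."""
    n = len(residues)
    return -sum((c / n) * math.log2(c / n) for c in Counter(residues).values())

# Hypothetical column: ten residue types, each observed 100 times.
column = [aa for aa in "ACDEFGHIKL" for _ in range(100)]
# One new observation of a residue type not seen before.
extended = column + ["M"]

kw_before, kw_after = wu_kabat(column), wu_kabat(extended)
h_before, h_after = shannon(column), shannon(extended)

print(f"KW: {kw_before:.2f} -> {kw_after:.2f}")
print(f"H:  {h_before:.4f} -> {h_after:.4f} bits")
```

In this constructed column the single new residue type raises KW by about 10%, because the count of distinct residues in the numerator jumps by a whole unit, while H changes by well under 1%.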

Knowledge of the variance at each site permits statistical assessment of differences in local diversity. For example, rapidly alternating peaks and valleys resembling noise might indicate structural properties present in a family of sequences. The Shannon measure can be used globally as well as locally because it is additive. That is, if H(X) is the diversity of locus X and H(Y) is that of locus Y, and X and Y are independent, then the diversity of X and Y taken together is H(X) + H(Y). As a result, diversity may be assigned not only to a site but also to a family of sequences. Dependent sites require special treatment.
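The additivity property can be checked numerically. The sketch below uses two hypothetical site distributions (the residue labels and probabilities are invented for illustration) and shows that the joint Shannon measure equals H(X) + H(Y) under independence, but falls short of that sum when one site determines the other.

```python
import math
from itertools import product

def entropy(probs):
    """Shannon information in bits for a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical residue distributions at two sites.
p_x = {"A": 0.5, "G": 0.3, "S": 0.2}
p_y = {"L": 0.7, "V": 0.3}

# Independent sites: joint probability is the product of the marginals.
joint_independent = {(a, b): p_x[a] * p_y[b] for a, b in product(p_x, p_y)}

# Dependent sites with the same marginals: Y is fully determined by X.
joint_dependent = {("A", "L"): 0.5, ("G", "V"): 0.3, ("S", "L"): 0.2}

h_x = entropy(p_x.values())
h_y = entropy(p_y.values())
h_joint_independent = entropy(joint_independent.values())
h_joint_dependent = entropy(joint_dependent.values())

print(f"H(X) + H(Y)          = {h_x + h_y:.4f}")
print(f"H(X,Y), independent  = {h_joint_independent:.4f}")
print(f"H(X,Y), dependent    = {h_joint_dependent:.4f}")
```

In the dependent case the joint value collapses to H(X), so simply summing per-site values would overstate the diversity of the pair.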

The Shannon measure is obtained by first estimating the multinomial proportions of the 20 amino acids. These estimates, \( p_1, \ldots, p_{20} \), are then used to compute \( H = -\sum_{i=1}^{20} p_i \log_2 p_i \). The estimated probabilities lead to a direct assessment of the variance of the Shannon measure and to a Monte Carlo method for finding the variance of any measure dependent on the multinomial. Graphs, in the KW format, of the three measures for mouse Vκ chains, including 95% confidence bounds, are presented. The Shannon measure shows a much broader distribution of diversity than either the KW or JAM measures for the mouse data. This and its lower noise level may be important in relating diversity to protein structure. Its usefulness extends to identification of conserved as well as highly diverse residues.
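One way to read the Monte Carlo step is as resampling from the fitted multinomial. The Python sketch below is a minimal version under that assumption, not the authors' procedure: it redraws the column from the estimated proportions, recomputes the statistic each time, and reports the sample variance with approximate 95% percentile bounds. The residue counts, function names, number of draws, and seed are all hypothetical.

```python
import math
import random
from collections import Counter

def shannon_from_counts(counts):
    """Plug-in Shannon information (bits) from a mapping of residue -> count."""
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values() if c > 0)

def monte_carlo_variance(counts, statistic, n_draws=2000, seed=0):
    """Resample from the fitted multinomial; return variance and ~95% percentile bounds."""
    rng = random.Random(seed)
    residues = list(counts)
    weights = [counts[r] for r in residues]
    n = sum(weights)
    values = []
    for _ in range(n_draws):
        # Draw n residues with the estimated proportions and recompute the statistic.
        sample = Counter(rng.choices(residues, weights=weights, k=n))
        values.append(statistic(sample))
    values.sort()
    mean = sum(values) / n_draws
    variance = sum((v - mean) ** 2 for v in values) / (n_draws - 1)
    lower = values[int(0.025 * n_draws)]
    upper = values[int(0.975 * n_draws) - 1]
    return variance, (lower, upper)

# Hypothetical counts at one residue site (100 sequences).
site_counts = {"A": 60, "G": 25, "S": 10, "T": 5}

h_hat = shannon_from_counts(site_counts)
h_var, (h_low, h_high) = monte_carlo_variance(site_counts, shannon_from_counts)

print(f"H = {h_hat:.3f} bits, variance = {h_var:.5f}, bounds = ({h_low:.3f}, {h_high:.3f})")
```

Because `statistic` is passed in as a function, the same resampling loop applies to any measure computed from the residue counts, which is the sense in which the method covers measures other than H.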


References

  1. Jores, R., Alzari, P.M., and Meo, T. Resolution of hypervariable regions in T-cell receptor beta chains by a modified Wu-Kabat index of amino acid diversity. Proc. Natl. Acad. Sci. USA, Vol. 87, Dec. 1990.

  2. Shannon, C.E. The mathematical theory of communication. University of Illinois Press: Urbana, Illinois, 1949.

  3. Wu, T.T. and Kabat, E.A. An Analysis of the Sequences of the Variable Regions of Bence Jones Proteins and Myeloma Light Chains and Their Implications for Antibody Complementarity. J. Exp. Med., Vol. 132, pp. 211–250, 1970.

  4. Nei, M. Analysis of gene diversity in subdivided populations. Proc. Natl. Acad. Sci. USA, Vol. 70, No. 12, Part I, pp. 3321–3323, Dec. 1973.

  5. Strohal, R., Helmberg, A., Kroemer, G., and Kofler, R. Mouse VK gene classification by nucleic acid sequence similarity. Immunogenetics, Vol. 30, pp. 425–493, 1989.


Copyright information

© 1992 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Litwin, S., Jores, R. (1992). Shannon Information as a Measure of Amino Acid Diversity. In: Perelson, A.S., Weisbuch, G. (eds) Theoretical and Experimental Insights into Immunology. NATO ASI Series, vol 66. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-76977-1_17


  • DOI: https://doi.org/10.1007/978-3-642-76977-1_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-76979-5

  • Online ISBN: 978-3-642-76977-1

  • eBook Packages: Springer Book Archive
