Abstract
The Wu-Kabat (KW) measure of diversity is intended to relate antigen-binding structure to residue position. Jores et al. (1) (JAM) generalized KW to resolve hypervariable regions in T-cell receptor β chains. This set the stage for further improvements in diversity measurement. Use of Shannon information improves diversity measurement in several important ways. The KW and JAM measures share several drawbacks:
1. The sampling variability of these measures is not available, so statistical comparison of diversity among residue sites is not done.
2. Both KW and JAM are unstable in the sense that a single new observation can cause substantial jumps in their values.
3. Neither KW nor JAM can represent the scope of amino acid diversity since they do not account for the proportions of all twenty amino acids.
The Shannon measure (H) addresses 1), greatly reduces 2), and is not subject to 3). We obtain the variance of H along with those of KW and JAM, and show that the coefficient of variation of H is much smaller than that of the other measures.
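The instability described in point 2 can be illustrated with a minimal sketch. It assumes the usual definition of the Wu-Kabat variability (the number of distinct residues at a position divided by the relative frequency of the most common one). The residue column and the function names are hypothetical, and JAM is left out because its definition is not given here.

```python
from collections import Counter
from math import log2


def wu_kabat(residues):
    """Assumed KW definition: distinct residues / frequency of the commonest."""
    counts = Counter(residues)
    top_frequency = counts.most_common(1)[0][1] / len(residues)
    return len(counts) / top_frequency


def shannon(residues):
    """Plug-in Shannon diversity (bits) from the observed proportions."""
    n = len(residues)
    return -sum((c / n) * log2(c / n) for c in Counter(residues).values())


# Hypothetical column of 100 aligned sequences at one residue position.
column = list("A" * 80 + "G" * 20)
# One additional sequence carrying a residue not seen before.
extended = column + ["W"]

kw_before, kw_after = wu_kabat(column), wu_kabat(extended)
h_before, h_after = shannon(column), shannon(extended)

# Relative change caused by the single new observation.
kw_jump = kw_after / kw_before - 1
h_jump = h_after / h_before - 1
print(f"KW: {kw_before:.3f} -> {kw_after:.3f} (relative change {kw_jump:.1%})")
print(f"H:  {h_before:.3f} -> {h_after:.3f} (relative change {h_jump:.1%})")
```

In this toy column the new residue type raises the KW numerator from 2 to 3 in one step, whereas H changes only through a term weighted by the new residue's small proportion.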
Knowledge of the variance at each site permits statistical assessment of differences in local diversity. For example, rapidly alternating peaks and valleys that resemble noise might instead indicate structural properties present in a family of sequences. The Shannon measure can be used globally as well as locally because it is additive. That is, if H(X) is the diversity of locus X, H(Y) is that of locus Y, and X and Y are independent, then the diversity of X and Y taken together is H(X) + H(Y). As a result, diversity may be assigned not only to a site but also to a family of sequences. Dependent sites require special treatment.
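The additivity property can be checked numerically. The sketch below uses two hypothetical sites with made-up residue probabilities, builds their joint distribution under independence, and compares the joint diversity with H(X) + H(Y). A perfectly coupled pair is included to show why dependent sites cannot simply be summed.

```python
from itertools import product
from math import isclose, log2


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


# Hypothetical residue distributions at two sites X and Y.
p_x = {"A": 0.5, "G": 0.25, "S": 0.25}
p_y = {"L": 0.9, "V": 0.1}

# Independence: the joint probability of a pair is the product of marginals.
joint = {(a, b): p_x[a] * p_y[b] for a, b in product(p_x, p_y)}

h_x = entropy(p_x.values())
h_y = entropy(p_y.values())
h_xy = entropy(joint.values())
print(f"H(X) + H(Y) = {h_x + h_y:.4f}, H(X, Y) = {h_xy:.4f}")

# Dependent example: the residue at Y is fully determined by the one at X.
coupled = {("A", "L"): 0.5, ("G", "V"): 0.5}
h_cx = entropy([0.5, 0.5])  # marginal of X in the coupled pair
h_cy = entropy([0.5, 0.5])  # marginal of Y in the coupled pair
h_coupled = entropy(coupled.values())
print(f"coupled: H(X) + H(Y) = {h_cx + h_cy:.4f}, H(X, Y) = {h_coupled:.4f}")
```

For the coupled pair the joint diversity is smaller than the sum of the single-site diversities, so summing site values would overstate the diversity of the pair.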
The Shannon measure is obtained by first estimating the multinomial proportions of the 20 amino acids. These estimates, \(p_1, \ldots, p_{20}\), are then used to compute \( H = -\sum_{i=1}^{20} p_i \log_2 p_i \). The estimated probabilities lead to a direct assessment of the variance of the Shannon measure and to a Monte Carlo method for finding the variance of any measure that depends on the multinomial. Graphs, in the KW format, of the three measures for mouse Vκ chains, including 95% confidence bounds, are presented. The Shannon measure shows a much broader distribution of diversity than either the KW or JAM measures for the mouse data. This and its lower noise level may be important in relating diversity to protein structure. Its usefulness extends to identification of conserved as well as highly diverse residues.
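The estimation steps above might be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the analytic variance is the standard first-order (delta-method) approximation for the plug-in entropy, the 95% bounds use a normal approximation, and the residue column, function names, and replicate count are hypothetical.

```python
import random
from collections import Counter
from math import log2, sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def proportions(residues):
    """Estimated multinomial proportions of the 20 amino acids."""
    counts = Counter(residues)
    n = len(residues)
    return {aa: counts.get(aa, 0) / n for aa in AMINO_ACIDS}


def shannon(p):
    """H = -sum p_i log2 p_i, skipping amino acids that were not observed."""
    return -sum(q * log2(q) for q in p.values() if q > 0)


def delta_variance(p, n):
    """First-order approximation (assumed here): (sum p log2(p)^2 - H^2) / n."""
    h = shannon(p)
    second_moment = sum(q * log2(q) ** 2 for q in p.values() if q > 0)
    return (second_moment - h * h) / n


def monte_carlo_variance(measure, p, n, reps=2000, seed=0):
    """Variance of any measure of the proportions, by multinomial resampling."""
    rng = random.Random(seed)
    letters = list(p)
    weights = [p[aa] for aa in letters]
    values = []
    for _ in range(reps):
        # Draw n residues from the estimated multinomial and re-evaluate.
        sample = rng.choices(letters, weights=weights, k=n)
        values.append(measure(proportions(sample)))
    mean = sum(values) / reps
    return sum((v - mean) ** 2 for v in values) / (reps - 1)


# Hypothetical column of 100 aligned sequences at one residue position.
column = list("A" * 60 + "G" * 25 + "S" * 10 + "T" * 5)
n = len(column)
p_hat = proportions(column)

h = shannon(p_hat)
var_delta = delta_variance(p_hat, n)
var_mc = monte_carlo_variance(shannon, p_hat, n)

# Normal-approximation 95% bounds (an assumption of this sketch).
half_width = 1.96 * sqrt(var_delta)
print(f"H = {h:.3f} bits, 95% bounds ({h - half_width:.3f}, {h + half_width:.3f})")
print(f"variance: delta method {var_delta:.5f}, Monte Carlo {var_mc:.5f}")
```

Because `monte_carlo_variance` takes the measure as an argument, the same resampling loop can be applied to any other function of the proportions.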
References
Jores, R., Alzari, P.M., and Meo, T. Resolution of hypervariable regions in T-cell receptor beta chains by a modified Wu-Kabat index of amino acid diversity. Proc. Natl. Acad. Sci. USA, Vol. 87, Dec. 1990.
Shannon, C.E. The mathematical theory of communication. University of Illinois Press: Urbana, Illinois, 1949.
Wu, T.T. and Kabat, E.A. An Analysis of the Sequences of the Variable Regions of Bence Jones Proteins and Myeloma Light Chains and Their Implications for Antibody Complementarity. J. Exp. Med., Vol. 132, pp. 211–250, 1970.
Nei, M. Analysis of gene diversity in subdivided populations. Proc. Natl. Acad. Sci. USA, Vol. 70, No. 12, Part I, pp. 3321–3323, Dec. 1973.
Strohal, R., Helmberg, A., Kroemer, G., and Kofler, R. Mouse VK gene classification by nucleic acid sequence similarity. Immunogenetics, Vol. 30, pp. 425–493, 1989.
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Litwin, S., Jores, R. (1992). Shannon Information as a Measure of Amino Acid Diversity. In: Perelson, A.S., Weisbuch, G. (eds) Theoretical and Experimental Insights into Immunology. NATO ASI Series, vol 66. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-76977-1_17
DOI: https://doi.org/10.1007/978-3-642-76977-1_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-76979-5
Online ISBN: 978-3-642-76977-1
eBook Packages: Springer Book Archive