Shannon (1948) has shown that a source \(({\mathcal {U}},P,U)\) with output \(U\) satisfying \(\mathrm{Prob}(U=u)=P_u\) can be encoded in a prefix code \({\mathcal{C}}=\{c_u:u\in{\mathcal {U}}\}\subset\{0,1\}^*\) such that for the entropy

\( H(P)=\sum\limits_{u\in{\mathcal {U}}}-P_u\log P_u\leq\sum\limits_{u\in{\mathcal {U}}}P_u\|c_u\|\leq H(P)+1,\)

where \(\|c_u\|\) is the length of \(c_u\).

We use a prefix code \(\mathcal{C}\) for another purpose, namely noiseless identification: every user who wants to know whether a letter \(u\in{\mathcal {U}}\) of interest to him is the actual source output or not can consider the RV \(C\) with \(C=c_U=(c_{U,1},\dots,c_{U,\|c_U\|})\) and check whether \(C=(C_1,C_2,\dots)\) coincides with \(c_u\) in the first, second, etc. letter, stopping at the first differing letter or when \(C=c_u\). Let \(L_{\mathcal{C}}(P,u)\) be the expected number of checks when code \(\mathcal{C}\) is used.
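This checking procedure is easy to make concrete. The following minimal Python sketch is our illustration, not the paper's construction (the function names and the dict-based representation of \(\mathcal{C}\) and \(P\) are ours): checks_until_decision counts the letters a user interested in \(u\) compares before deciding, and expected_checks averages this count over the source output, giving \(L_{\mathcal{C}}(P,u)\).

```python
def checks_until_decision(c_output: str, c_u: str) -> int:
    """Letters a user interested in u compares before deciding C == c_u or not."""
    n = 0
    for a, b in zip(c_output, c_u):
        n += 1
        if a != b:  # first differing letter: the output is not u
            return n
    # In a prefix code no codeword is a proper prefix of another, so running
    # out of letters with no mismatch means c_output == c_u: the output is u.
    return n

def expected_checks(code: dict, P: dict, u) -> float:
    """L_C(P, u): expected number of checks when the source output has law P."""
    return sum(P[v] * checks_until_decision(code[v], code[u]) for v in P)
```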

Our discovery is an identification entropy, namely the function

\(H_I(P)=2\left(1-\sum\limits_{u\in{\mathcal {U}}}P_u^2\right).\)
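Two immediate consequences of the definition (direct computations, not claims quoted from the paper): \(H_I(P)=0\) exactly when the source is deterministic, and for the uniform distribution on \(N\) letters

\( H_I(P)=2\left(1-\sum\limits_{u\in{\mathcal {U}}}\frac1{N^2}\right)=2\left(1-\frac1N\right),\)

so \(H_I(P)<2\) for every \(P\), in contrast to Shannon entropy, which grows without bound.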

We prove that \(L_{\mathcal{C}}(P,P)=\sum\limits_{u\in{\mathcal {U}}}P_u L_{\mathcal{C}}(P,u)\geq H_I(P)\) and thus also that

\( L(P)=\min\limits_{\mathcal{C}}\max\limits_{u\in{\mathcal {U}}}L_{\mathcal{C}}(P,u)\geq H_I(P)\)

and related upper bounds, which demonstrate the operational significance of identification entropy in noiseless source coding, similar to the role Shannon entropy plays in noiseless data compression.
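As a concrete sanity check (our toy example, reusing expected_checks from the sketch above), take \(\mathcal{U}=\{a,b,c,d\}\) with \(P=(\frac12,\frac14,\frac18,\frac18)\) and the Huffman code \(c_a=0\), \(c_b=10\), \(c_c=110\), \(c_d=111\):

```python
code = {"a": "0", "b": "10", "c": "110", "d": "111"}
P    = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

L_CPP = sum(P[u] * expected_checks(code, P, u) for u in P)  # L_C(P,P)
H_I   = 2 * (1 - sum(p * p for p in P.values()))            # identification entropy

print(L_CPP, H_I)  # both are 1.3125: the lower bound holds, here with equality
```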

Other averages such as \(\bar L_{\mathcal{C}}(P)=\frac1{|{\mathcal {U}}|} \sum\limits_{u\in{\mathcal {U}}}L_{\mathcal{C}}(P,u)\) are also discussed, in particular for Huffman codes, where classically equivalent Huffman codes may now differ.
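This average is one more line on top of the sketch above (again our naming, offered only as an illustration):

```python
def mean_checks(code: dict, P: dict) -> float:
    """bar L_C(P): unweighted average of L_C(P, u) over all u in U."""
    return sum(expected_checks(code, P, u) for u in code) / len(code)
```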

We also show that prefix codes whose codewords correspond to the leaves of a regular binary tree are universally good for this average.
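As a quick numerical illustration of this (not a proof of the universality claim), the regular-tree code \(c_a=00\), \(c_b=01\), \(c_c=10\), \(c_d=11\) on the toy distribution above gives a small average:

```python
balanced = {"a": "00", "b": "01", "c": "10", "d": "11"}
print(mean_checks(balanced, P))  # 1.5 for the toy distribution P above
```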






References

  1. Shannon, C.E.: A mathematical theory of communication. Bell Syst. Techn. J. 27, 379–423 (1948)
  2. Huffman, D.A.: A method for the construction of minimum redundancy codes. Proc. IRE 40, 1098–1101 (1952)
  3. Ahlswede, R., Dueck, G.: Identification via channels. IEEE Trans. Inf. Theory 35(1), 15–29 (1989)
  4. Ahlswede, R.: General theory of information transfer: updated. In: General Theory of Information Transfer and Combinatorics, special issue of Discrete Applied Mathematics
  5. Ahlswede, R., Balkenhol, B., Kleinewächter, C.: Identification for sources. In: Ahlswede, R., Bäumer, L., Cai, N., Aydinian, H., Blinovsky, V., Deppe, C., Mashurian, H. (eds.) General Theory of Information Transfer and Combinatorics. LNCS, vol. 4123, pp. 51–61. Springer, Heidelberg (2006)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • R. Ahlswede, Fakultät für Mathematik, Universität Bielefeld, Bielefeld, Germany
