Date: 27 May 2005

Specification, synchronisation, average length

Abstract

Consider a source that delivers finite sequences of symbols, generally zeros and ones, in a constrained format dictated by the data processor. The constraints are assumed to be time independent; of particular importance are those specified by a finite list of forbidden blocks of symbols, e.g. upper and lower bounds on the run lengths of zeros and ones. Factorial extendable languages are an appropriate mathematical model for such strings of symbols. For strings long enough to seem infinite, we have to deal with (Shannon) entropy if we want to measure the quantity of information dispatched by the source: the greater the entropy, the greater the information. Roughly speaking, the entropy is the exponential growth rate of the number of strings of $n$ letters that occur in the strings provided by the source.
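As an illustration of entropy as a growth rate, the following minimal sketch counts the binary words of each length that avoid a finite list of forbidden blocks and estimates the entropy from the ratio of successive counts. The constraint used here (the single forbidden block `11`), the function names, and the brute-force enumeration are assumptions chosen for the example, not taken from the paper.

```python
import math
from itertools import product


def count_allowed(n, forbidden, alphabet="01"):
    """Count words of length n that contain none of the forbidden blocks."""
    total = 0
    for letters in product(alphabet, repeat=n):
        word = "".join(letters)
        if not any(block in word for block in forbidden):
            total += 1
    return total


# Assumed example constraint: no two consecutive ones.
forbidden = ["11"]
counts = [count_allowed(n, forbidden) for n in range(1, 11)]

# The ratio of successive counts approximates exp(entropy),
# so its logarithm approximates the entropy (in nats).
entropy_estimate = math.log(counts[-1] / counts[-2])

print(counts)
print(entropy_estimate)
```

For this particular constraint the counts follow the Fibonacci recurrence, so the estimate approaches the logarithm of the golden ratio. Brute-force enumeration is only practical for short words; it is used here because it mirrors the definition directly.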

A particular case is that of a source sending successive blocks $m_1, m_2, \dots$ randomly chosen in a set $X$ of words on an alphabet $A$; we obtain messages of the type $m_1 m_2 \cdots m_n$. We call $X$ a variable length code if every word on $X$ has only one decomposition on $X$. If $a_n$ is the number of words of $n$ letters in $X$, then the entropy of the associated process is equal to $\log r$, where $r$ is the positive real number such that $1 = a_1/r + a_2/r^2 + a_3/r^3 + \cdots$. If you pick a block $m$ in the sequence dispatched by the source, its average length is (Berstel-Perrin) $a = a_1/r + 2a_2/r^2 + 3a_3/r^3 + \cdots$, which is called the average length of the code. For a given entropy, the smaller the average length, the more economical the transmission, and so we investigate the problem: how can the average length of a code be minimized for a given entropy?
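The two series above can be evaluated numerically for a finite code. The sketch below is a rough illustration under stated assumptions: the length distribution is given as a hypothetical dictionary mapping each length n to the number of code words of that length, r is found by bisection as the positive root of the first series set equal to 1, and the growth rate is read as log r for that root. The function names and the example code {a, ba, bb} are illustrative and not from the paper.

```python
import math


def length_series(a, r):
    """Evaluate a_1/r + a_2/r^2 + ... for a finite length distribution a."""
    return sum(count / r ** n for n, count in a.items())


def solve_r(a, iterations=100):
    """Find the positive r with length_series(a, r) == 1 by bisection.

    The series is strictly decreasing in r for r > 0, so the root is unique.
    """
    if not a or any(count <= 0 for count in a.values()):
        raise ValueError("expected a non-empty distribution with positive counts")
    lo = hi = 1.0
    while length_series(a, hi) > 1.0:   # widen to the right until the sum is <= 1
        hi *= 2.0
    while length_series(a, lo) < 1.0:   # widen to the left until the sum is >= 1
        lo /= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if length_series(a, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def average_length(a, r):
    """Evaluate a_1/r + 2*a_2/r^2 + 3*a_3/r^3 + ..."""
    return sum(n * count / r ** n for n, count in a.items())


# Assumed example: the code {a, ba, bb}, with one word of length 1 and two of length 2.
a = {1: 1, 2: 2}
r = solve_r(a)
print(r, math.log(r), average_length(a, r))
```

For this example the equation reduces to 1/r + 2/r^2 = 1, whose positive root is r = 2, giving an average length of 1.5. Comparing `average_length` across different length distributions that share the same r is one simple way to explore the minimization problem posed above.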