1 Introduction

Krawtchouk polynomials are part of the legacy of Mikhail Kravchuk (Krawtchouk); see [19] for a valuable resource on his life and work, including developments through 2004 building on it. Krawtchouk polynomials appear in diverse areas of mathematics and science. Important applications, such as to image processing, may be found in [20].

We cite [1, 16, 18], where Krawtchouk polynomials are used as the foundation for discrete models of quantum physics. They also appear naturally in the study of random walks in quantum probability [7, 8].

Multivariate Krawtchouk polynomials play an increasingly important role in mathematical physics as well as in probability and statistics [4, 11]. The approach of [21] to composition Markov chains ties in with the multivariate polynomials as well, cf. [14].

After this Introduction, we continue with the probabilistic construction of Krawtchouk polynomials. They appear as the elementary symmetric functions in the jumps of a random walk, providing a system of martingales based on the random walk. Some fundamental recurrence relations are presented as well. The construction immediately yields their orthogonality relations. Alternative probabilistic approaches to ours of Sect. 2 are to be found in [2, 13, 14, 15].

Section 3 provides the linearization and convolution formulas that are the core of the paper. They are related to formulas found in [10, 17]. The next section, Sect. 4, specializes to the case of a symmetric random walk, where the formulas simplify considerably.

In Sect. 5 we introduce shift operators and use them to develop a computationally effective approach to finding transforms and convolutions. This differs from our principal work with operator calculus [6] and our recent approach to Krawtchouk transforms [9], and is suitable for numerical as well as symbolic computations.

Section 6 presents special bases in which the Krawtchouk matrices are anti-diagonal. These basis functions have limited support and appear to be useful for implementing filtering methods in the Krawtchouk setting. The article concludes with remarks in Sect. 7.

2 Combinatorial and probabilistic basis: main features

Consider a collection of N bits, elements of \(B=\{0,1\}\), or signs, elements of \(S=\{-1,1\}\). Correspondingly, we let j denote the number of 0’s or \(-1\)’s, and we denote the sum in either case by x, so that \(x=N-j\) for bits and \(x=N-2j\) for signs. Order the elements of B or S and denote them by \(X_i\), \(1\le i\le N\). We can encode this information in the generating function

$$\begin{aligned} G_N(v)=\prod _{i=1}^N (1+vX_i) \end{aligned}$$

Now introduce a binomial probability space with the \(X_i\) a sequence of independent, identically distributed Bernoulli variables. With p the probability of “success”, \(q=1-p\), the centered random variables are distributed as follows:

Bits: \(\displaystyle X_i-p={\left\{ \begin{array}{ll}\ \ q,&{}\text {with probability }p\\ -p,&{}\text {with probability }q \end{array}\right. } \)

Signs: \(\displaystyle X_i-(p-q)={\left\{ \begin{array}{ll}\ \ 2q,&{}\text {with probability }p\\ -2p,&{}\text {with probability }q \end{array}\right. } \)

To get a sequence of orthogonal functionals of the process we redefine

$$\begin{aligned} G_N(v)=\prod _{i=1}^N (1+v(X_i-\mu ))=\sum _n v^n\,k_n(j,N) \end{aligned}$$
(1)

where \(\mu \) is the expected value of \(X_i\).

Remark

Note that this effectively defines the polynomials \(k_n(j,N)\) as the elementary symmetric functions in the variables \(X_i-\mu \). See [12] where this is extended to the multivariate case and [3] where this viewpoint plays an important role in the study of positivity results for bilinear sums of Krawtchouk polynomials.

We see that the two cases differ effectively by a rescaling of v. To see how this comes about, consider general Bernoulli variables \(X_i\) taking values a and b with probabilities p and q respectively. Then the centered variables take values

$$\begin{aligned}{\left\{ \begin{array}{ll} a-(pa+qb)=\lambda q, &{}\text {with probability }p\\ b-(pa+qb)=-\lambda p, &{}\text {with probability }q \end{array}\right. } \end{aligned}$$

where \(\lambda =a-b\). We can take as the standard model \(b=0\) and \(a=\lambda \). Then

$$\begin{aligned} \mu =\lambda p\quad \text {and}\quad \sigma ^2=\lambda ^2pq \end{aligned}$$

are the mean and variance of \(X_i\). Thus, G has the form

$$\begin{aligned} G_N(v)=(1+\lambda qv)^{N-j}(1-\lambda pv)^j=\sum _n v^n\,k_n(j,N) \end{aligned}$$

with j counting the number of 0’s and

$$\begin{aligned} k_n(j,N)=\lambda ^n\,\sum _i \left( {\begin{array}{c}N-j\\ n-i\end{array}}\right) \left( {\begin{array}{c}j\\ i\end{array}}\right) (-1)^i p^i q^{n-i} \end{aligned}$$

These are polynomials in the variable j, Krawtchouk polynomials. We define a corresponding matrix

$$\begin{aligned} \Phi _{ij}^{(N)}=k_i(j,N) \end{aligned}$$

which acts as a transformation on \(\mathbb R^{N+1}\), considered as the space of functions defined on the set \(\{0,1,\ldots ,N\}\). The generic form, Eq. (1), is convenient for revealing and proving properties of the Krawtchouk polynomials, and of the transform \(\Phi \).
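For concreteness, small cases are easy to compute. The following minimal Python sketch (the helper names `k` and `phi`, and the sample parameters, are ours, not from the text) builds \(\Phi ^{(N)}\) directly from the explicit sum above.

```python
from math import comb

def k(n, j, N, lam, p):
    """Krawtchouk polynomial k_n(j, N) from the explicit sum above."""
    q = 1 - p
    return lam**n * sum(comb(N - j, n - i) * comb(j, i) * (-1)**i * p**i * q**(n - i)
                        for i in range(n + 1))

def phi(N, lam, p):
    """Matrix Phi^(N) with (i, j) entry k_i(j, N)."""
    return [[k(i, j, N, lam, p) for j in range(N + 1)] for i in range(N + 1)]

# Symmetric example: N = 4, lam = 2, p = 1/2 gives integer entries.
for row in phi(4, 2, 0.5):
    print([round(v) for v in row])
```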

We review here some principal features of this construction [7, 8].

Remark

Denote expectation with respect to the underlying binomial distribution with angle brackets:

$$\begin{aligned} \langle f(X) \rangle =\sum _j \left( {\begin{array}{c}N\\ j\end{array}}\right) f(j)\,p^{N-j}q^j \end{aligned}$$

and corresponding inner product \(\langle f,g \rangle =\langle f(X)g(X) \rangle \) .

2.1 Martingale property

Since the \(X_i\) are independent and \(X_i-\mu \) has mean zero, we have the martingale property

$$\begin{aligned} E(G_{N+1}|{\mathcal F}_N)=\langle (1+v(X_{N+1}-\mu )) \rangle \, G_N=G_N \end{aligned}$$

where \({\mathcal F}_N\) is the \(\sigma \)-field generated by \(\{X_1,\ldots ,X_N\}\). Thus each coefficient \(k_n(j,N)\) is a martingale, where j denotes the number of 0’s in the random sequence of 0’s and 1’s which is the sample path of the underlying Bernoulli process. This gives immediately

Proposition 2.1

Martingale recurrence

$$\begin{aligned} k_n(j,N)=p\,k_n(j,N+1)+q\,k_n(j+1,N+1) \end{aligned}$$

One can derive this purely algebraically from the Pascal recurrences presented in the next subsection.
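As a quick sanity check, the martingale recurrence can be verified exactly in rational arithmetic; the sketch below reuses the hypothetical helper `k` from above, with arbitrary sample parameters.

```python
from fractions import Fraction
from math import comb

def k(n, j, N, lam, p):
    q = 1 - p
    return lam**n * sum(comb(N - j, n - i) * comb(j, i) * (-1)**i * p**i * q**(n - i)
                        for i in range(n + 1))

lam, p, N = 3, Fraction(1, 3), 6          # sample values only
q = 1 - p
assert all(k(n, j, N, lam, p) == p * k(n, j, N + 1, lam, p) + q * k(n, j + 1, N + 1, lam, p)
           for n in range(N + 1) for j in range(N + 1))
print("martingale recurrence verified")
```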

2.2 Pascal recurrences and square identity

As is evident from the form of the generating function G, we have recurrences analogous to the Pascal triangle for binomial coefficients.

Proposition 2.2

Pascal recurrences

  1. \(k_n(j,N+1)=k_n(j,N)+\lambda q\,k_{n-1}(j,N)\)

  2. \(k_n(j+1,N+1)=k_n(j,N)-\lambda p\,k_{n-1}(j,N)\)

These follow directly, first considering \((1+\lambda qv)G_N(v)=G_{N+1}(v)\) and second

$$\begin{aligned} (1-\lambda pv)G_N(v)=G_{N+1}\bigm |_{j\rightarrow j+1}\ . \end{aligned}$$

Note that the martingale property follows by combining p times the first equation with q times the second.

Given four contiguous entries forming a \(2\times 2\) submatrix of \(\Phi ^{(N)}\), the square identity produces the lower left corner from the other three values. In terms of the k’s:

Proposition 2.3

Square identity

$$\begin{aligned} k_n( j,N)=\lambda p\,k_{n-1}( j,N)+\lambda q\,k_{n-1}( j+1,N)+k_n( j+1,N) \end{aligned}$$

Proof

Combine p times the first equation above with q times that same equation with \(j\rightarrow j+1\). Applying the martingale recurrence on the left-hand side yields

$$\begin{aligned} k_n( j,N)=p\,k_n( j,N)+q\,k_n( j+1,N)+\lambda pq\,k_{n-1}( j,N)+\lambda q^2\,k_{n-1}( j+1,N) \end{aligned}$$

Subtracting off \(p\,k_n( j,N)\) and dividing out a common factor of q yields the result. \(\square \)
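A similar exact check covers both Pascal recurrences and the square identity; again the helper `k` and the parameter choices are illustrative assumptions, not part of the derivation.

```python
from fractions import Fraction
from math import comb

def k(n, j, N, lam, p):
    q = 1 - p
    return lam**n * sum(comb(N - j, n - i) * comb(j, i) * (-1)**i * p**i * q**(n - i)
                        for i in range(n + 1))

lam, p, N = 2, Fraction(1, 4), 5
q = 1 - p
for n in range(1, N + 1):
    for j in range(N):
        # Pascal recurrences
        assert k(n, j, N + 1, lam, p) == k(n, j, N, lam, p) + lam * q * k(n - 1, j, N, lam, p)
        assert k(n, j + 1, N + 1, lam, p) == k(n, j, N, lam, p) - lam * p * k(n - 1, j, N, lam, p)
        # square identity
        assert k(n, j, N, lam, p) == (lam * p * k(n - 1, j, N, lam, p)
                                      + lam * q * k(n - 1, j + 1, N, lam, p)
                                      + k(n, j + 1, N, lam, p))
print("Pascal recurrences and square identity verified")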

2.3 Orthogonality

For orthogonality, we wish to show that \(\langle G(v)G(w) \rangle \) is a function of the product vw only. We have, using independence and centering,

$$\begin{aligned} \langle G(v)G(w) \rangle&=\prod \langle (1+(v+w)(X_i-\mu )+vw(X_i-\mu )^2) \rangle \\&=\prod (1+vw\,\sigma ^2)=(1+vw\,\sigma ^2)^N \end{aligned}$$

where the variance \(\sigma ^2=\lambda ^2 pq\) in our context. This yields the squared norms

$$\begin{aligned} \Vert k_n\Vert ^2=\langle k_n,k_n \rangle =\left( {\begin{array}{c}N\\ n\end{array}}\right) (\lambda ^2 pq)^n\ . \end{aligned}$$

Introducing matrices, we can express the orthogonality relations compactly. Let B, the binomial distribution matrix, be the diagonal matrix

$$\begin{aligned} B=\text {diag}\,\left( p^N,Np^{N-1}q,\ldots ,\left( {\begin{array}{c}N\\ i\end{array}}\right) \,p^{N-i}q^i,\ldots ,q^N\right) \end{aligned}$$

Let \(\Gamma \) denote the diagonal matrix of squared norms,

$$\begin{aligned} \Gamma =\text {diag}\,\left( 1,N(\lambda ^2 pq),\ldots ,\left( {\begin{array}{c}N\\ i\end{array}}\right) \,(\lambda ^2 pq)^i,\ldots ,(\lambda ^2 pq)^N\right) \end{aligned}$$

For fixed N, we write \(\Phi \) for \(\Phi ^{(N)}\) which has ij entry equal to \(k_i(j,N)\). Now \(G(v)=\sum v^i\Phi _{ij}\), and we have

$$\begin{aligned} \sum _{i,j} v^iw^j(\Phi B\Phi ^T)_{ij}&=\sum _{i,j,n}v^i w^j\Phi _{in}B_{nn}\Phi _{jn}\\&=\langle G(v)G(w) \rangle =\sum _n(vw)^n\Gamma _{nn}\ . \end{aligned}$$

In other words, the orthogonality relation takes the form

$$\begin{aligned} \Phi B\Phi ^T=\Gamma \end{aligned}$$

which gives for the inverse

$$\begin{aligned} \Phi ^{-1}=B\Phi ^T\Gamma ^{-1}\ . \end{aligned}$$
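Numerically, the relations \(\Phi B\Phi ^T=\Gamma \) and \(\Phi ^{-1}=B\Phi ^T\Gamma ^{-1}\) are easy to confirm; the sketch below (our own helper names and sample parameters) does so with numpy.

```python
import numpy as np
from math import comb

def k(n, j, N, lam, p):
    q = 1 - p
    return lam**n * sum(comb(N - j, n - i) * comb(j, i) * (-1)**i * p**i * q**(n - i)
                        for i in range(n + 1))

N, lam, p = 6, 1.0, 0.3
q = 1 - p
Phi = np.array([[k(i, j, N, lam, p) for j in range(N + 1)] for i in range(N + 1)])
B = np.diag([comb(N, j) * p**(N - j) * q**j for j in range(N + 1)])
Gamma = np.diag([comb(N, n) * (lam**2 * p * q)**n for n in range(N + 1)])

assert np.allclose(Phi @ B @ Phi.T, Gamma)                                   # orthogonality
assert np.allclose(B @ Phi.T @ np.linalg.inv(Gamma) @ Phi, np.eye(N + 1))    # inverse formula
print("orthogonality and inversion relations verified")
```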

In the following sections we will detail linearization formulas for the symmetric and non-symmetric cases, derive the corresponding recurrence formulas and then look at the associated convolution operators on functions.

3 Krawtchouk polynomials: general case

We have the generating function

$$\begin{aligned} G(v)=(1+\lambda qv)^{N-j}(1-\lambda pv)^j=\sum _{0\le n\le N} v^n\,k_n(j,N) \end{aligned}$$

with j running from 0 to N. The main feature is the relation

$$\begin{aligned} G(v)=\prod (1+v(X_i-\mu )) \end{aligned}$$

where \(X_i\) are independent Bernoulli variables taking values \(\lambda \) and 0 with probabilities p and q respectively.

3.1 Linearization coefficients

We want the expansion of the product \(k_\ell k_m\) in terms of \(k_n\). First, a simple lemma

Lemma 3.1

Let X take values \(\lambda \) and 0. Then the identity

$$\begin{aligned} (X-\lambda p)^2=\lambda (q-p)(X-\lambda p)+\lambda ^2pq \end{aligned}$$

holds.

Proof

It is immediately checked. To derive it, expand \(x(x-\lambda )\) in Taylor series about \(\lambda p\) and equate the result to zero. \(\square \)

In our context, we can write this as

$$\begin{aligned} (X-\mu )^2=\lambda (q-p)(X-\mu )+\sigma ^2 \end{aligned}$$
(2)

Now multiply

$$\begin{aligned} G(v)G(w)&=\prod \bigl (1+(v+w)(X_i-\mu )+vw(X_i-\mu )^2\bigr )\\&=\prod \bigl (1+(v+w+\lambda (q-p)vw)(X_i-\mu )+\sigma ^2vw\bigr ) \end{aligned}$$

by the Lemma. Factoring out \(1+\sigma ^2vw\) from each term and re-expanding yields

$$\begin{aligned}&\sum _{\ell ,m} v^\ell w^m k_\ell (j,N)k_m(j,N)\nonumber \\&\quad =(1+\sigma ^2vw)^N\,\prod \left( 1+\frac{v+w+\lambda (q-p)vw}{1+\sigma ^2vw} (X_i-\mu ) \right) \nonumber \\&\quad =\sum _n(1+\sigma ^2vw)^{N-n}(v+w+\lambda (q-p)vw)^n\,k_n(j,N) \end{aligned}$$
(3)

Expanding the coefficient of \(k_n\), we have

$$\begin{aligned} \sum _{\alpha +\beta +\gamma =n,\delta }\left( {\begin{array}{c}n\\ \alpha ,\beta ,\gamma \end{array}}\right) \left( {\begin{array}{c}N-n\\ \delta \end{array}}\right) v^\alpha w^\beta (\lambda (q-p)vw)^\gamma (\sigma ^2vw)^\delta \end{aligned}$$

Fixing

$$\begin{aligned} \ell =n-\beta +\delta \quad \text {and}\quad m=n-\alpha +\delta \end{aligned}$$

yields

Theorem 3.2

Linearization formula.

The coefficient of \(k_n\) in the expansion of the product \(k_\ell k_m\) is

$$\begin{aligned} \sum _\delta \frac{n!}{(n-m+\delta )!(n-\ell +\delta )!(\ell +m-n-2\delta )!}\,\left( {\begin{array}{c}N-n\\ \delta \end{array}}\right) (\lambda (q-p))^{\ell +m-n-2\delta } \sigma ^{2\delta }\ . \end{aligned}$$

Remark

This expansion is interesting as well in the study of bilinear sums of Krawtchouk polynomials, see [3, 5].
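The linearization formula lends itself to a direct numerical test. The sketch below (our helper names; the parameters are arbitrary sample values) expands \(k_\ell k_m\) by Theorem 3.2 and compares it pointwise with the product on the grid \(\{0,\ldots ,N\}\).

```python
from math import comb, factorial

def k(n, j, N, lam, p):
    q = 1 - p
    return lam**n * sum(comb(N - j, n - i) * comb(j, i) * (-1)**i * p**i * q**(n - i)
                        for i in range(n + 1))

def lin_coeff(l, m, n, N, lam, p):
    """Coefficient of k_n in k_l * k_m, per Theorem 3.2."""
    q, s2, total = 1 - p, lam**2 * p * (1 - p), 0.0
    for d in range(N - n + 1):
        a, b, c = n - m + d, n - l + d, l + m - n - 2 * d
        if min(a, b, c) < 0:
            continue
        total += (factorial(n) / (factorial(a) * factorial(b) * factorial(c))
                  * comb(N - n, d) * (lam * (q - p))**c * s2**d)
    return total

N, lam, p = 7, 1.0, 0.35
for l in range(N + 1):
    for m in range(N + 1):
        for j in range(N + 1):
            lhs = k(l, j, N, lam, p) * k(m, j, N, lam, p)
            rhs = sum(lin_coeff(l, m, n, N, lam, p) * k(n, j, N, lam, p) for n in range(N + 1))
            assert abs(lhs - rhs) < 1e-8
print("linearization formula verified on the grid {0,...,N}")
```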

3.1.1 Recurrence formula

The three-term recurrence formula characteristic of orthogonal polynomials follows by specializing \(\ell =1\) in the linearization formula. First, compute the constant term and coefficient of v from the generating function G :

$$\begin{aligned} k_0=1\quad \text {and}\quad k_1=\lambda (Nq-j) \end{aligned}$$

From the linearization formula, we pick up three terms, with \(n=m\) and \(n=m\pm 1\). We get

Proposition 3.3

Recurrence formula

$$\begin{aligned} \lambda (Nq-j)\,k_m=(m+1)k_{m+1}+\lambda (q-p)\,m\,k_m+\lambda ^2pq(N+1-m)\,k_{m-1} \end{aligned}$$

The terms \(k_m\) and \(k_{m+1}\) arise with \(\delta =0\), while the term \(k_{m-1}\) is the only contribution for \(\delta =1\).
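The recurrence gives a convenient way to generate the rows of \(\Phi ^{(N)}\) one at a time. A minimal sketch follows (our own variable names; the symmetric parameters are chosen here so that the entries are integers).

```python
from fractions import Fraction

lam, p, N = 2, Fraction(1, 2), 4
q = 1 - p

def next_row(km, km1, m):
    """Given the rows k_m and k_{m-1} (values over j), return k_{m+1} via the recurrence."""
    return [(lam * (N * q - j) * km[j] - lam * (q - p) * m * km[j]
             - lam**2 * p * q * (N + 1 - m) * km1[j]) / (m + 1) for j in range(N + 1)]

rows = [[Fraction(1)] * (N + 1), [lam * (N * q - j) for j in range(N + 1)]]   # k_0, k_1
for m in range(1, N):
    rows.append(next_row(rows[m], rows[m - 1], m))
for row in rows:
    print([int(v) for v in row])      # with these parameters the entries are integers
```

With these sample parameters the printed rows reproduce the matrix \(\Phi ^{(4)}\) appearing in the example of Sect. 6.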

3.2 Krawtchouk transforms: inversion

We identify functions on \(\{0,1,\ldots ,N\}\) with \(\mathbb R^{N+1}\) and the Krawtchouk transforms via the action of the matrix \(\Phi ^{(N)}\) on that space. For given N, \(\Phi \) denotes \(\Phi ^{(N)}\).

For our standard transform, we think of row vectors with multiplication by \(\Phi \) on the right. Thus, the transform F of a function f is given by

$$\begin{aligned} F(j)=\sum _n f(n)\,k_n(j,N) \qquad \text {or} \qquad \mathbf{F}^\dag =\mathbf{f}^\dag \,\Phi \end{aligned}$$

where, e.g., \(\mathbf{f}\) is the column vector whose entries are the corresponding values of f.

The inversion formula is conveniently expressed in terms of matrices.

Proposition 3.4

Let P be the diagonal matrix

$$\begin{aligned} P=\text{ diag }\,\left( (\lambda p)^N,\ldots ,(\lambda p)^{N-j},\ldots ,1\right) \end{aligned}$$

Let \(P'\) be the diagonal matrix

$$\begin{aligned} P'=\mathrm{diag}\,\left( 1,\ldots ,(\lambda p)^j,\ldots ,(\lambda p)^N\right) \end{aligned}$$

Then

$$\begin{aligned} \Phi P\Phi =\lambda ^NP' \end{aligned}$$

The proof is similar to that for orthogonality.

Proof

The matrix equation is the same as the corresponding identity via generating functions. Namely,

$$\begin{aligned} \sum _{i,j,n}v^i k_{i}(n,N)(\lambda p)^{N-n}k_{n}(j,N)w^j\left( {\begin{array}{c}N\\ j\end{array}}\right) =\lambda ^N(1+\lambda pvw)^N \end{aligned}$$

First, sum over i, using the generating function G(v), with j replaced by n. Then sum over n, again using the generating function. Finally, summing over j using the binomial theorem yields the desired result, via \(p+q=1\). \(\square \)

Thus,

Corollary 3.5

$$\begin{aligned} \Phi ^{-1}=\lambda ^{-N}\,P\Phi P'^{-1} \end{aligned}$$

which is the basis for an efficient inversion algorithm, being a simple modification of the original transform.
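The following sketch (our helper names, sample parameters) checks Corollary 3.5 directly against the numerical inverse.

```python
import numpy as np
from math import comb

def k(n, j, N, lam, p):
    q = 1 - p
    return lam**n * sum(comb(N - j, n - i) * comb(j, i) * (-1)**i * p**i * q**(n - i)
                        for i in range(n + 1))

N, lam, p = 5, 2.0, 0.4
Phi = np.array([[k(i, j, N, lam, p) for j in range(N + 1)] for i in range(N + 1)])
P  = np.diag([(lam * p)**(N - j) for j in range(N + 1)])
Pp = np.diag([(lam * p)**j       for j in range(N + 1)])

Phi_inv = lam**(-N) * P @ Phi @ np.linalg.inv(Pp)      # Corollary 3.5
assert np.allclose(Phi_inv @ Phi, np.eye(N + 1))
print("inverse via P and P' agrees with the matrix inverse")
```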

3.3 Convolution

Corresponding to the product of two transforms F and G is the convolution of the original functions f and g. We have, following the proof of the linearization formula, Eq. (3),

$$\begin{aligned} F(j)G(j)&=\sum _{\ell ,m}f(\ell )g(m)k_\ell (j)k_m(j)\\&=\sum _n k_n(j) \sum _{\alpha ,\beta ,\delta } \frac{n!}{\alpha !\beta !(n-\alpha -\beta )!}\,\left( {\begin{array}{c}N-n\\ \delta \end{array}}\right) \\&\quad \times (\lambda (q-p))^{n-\alpha -\beta }(\lambda ^2pq)^{\delta } f(n-\beta +\delta )g(n-\alpha +\delta ) \ . \end{aligned}$$

Thus, we may define the convolution of two functions f and g on \(\{0,1,\ldots ,N\}\) by

$$\begin{aligned} (f\star g)(n)&=\sum _{\alpha ,\beta ,\delta } \left( {\begin{array}{c}n\\ \alpha ,\beta ,n-\alpha -\beta \end{array}}\right) \,\left( {\begin{array}{c}N-n\\ \delta \end{array}}\right) \nonumber \\&\quad \times (\lambda (q-p))^{n-\alpha -\beta }(\lambda ^2pq)^{\delta } f(n-\beta +\delta )g(n-\alpha +\delta ) \end{aligned}$$
(4)

and we have the relation

$$\begin{aligned} F(j)G(j)=\sum _n (f\star g)(n)\,k_n(j). \end{aligned}$$

Now, using the inversion formula, Corollary 3.5, we have the relation

$$\begin{aligned} (f\star g)(n)=\lambda ^{-N}\,\sum _j F(j)G(j)\,(\lambda p)^{N-n-j}\,k_j(n) \end{aligned}$$

for the convolution of functions.
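As a check on Eq. (4) and the relation just stated, the sketch below (helper names and parameters are ours) computes \(f\star g\) directly and compares its transform with the pointwise product FG.

```python
import numpy as np
from math import comb, factorial

def k(n, j, N, lam, p):
    q = 1 - p
    return lam**n * sum(comb(N - j, n - i) * comb(j, i) * (-1)**i * p**i * q**(n - i)
                        for i in range(n + 1))

def star(f, g, N, lam, p):
    """Convolution of Eq. (4); f and g are sequences of length N+1."""
    q, s2 = 1 - p, lam**2 * p * (1 - p)
    h = []
    for n in range(N + 1):
        total = 0.0
        for a in range(n + 1):
            for b in range(n - a + 1):
                for d in range(N - n + 1):
                    total += (factorial(n) / (factorial(a) * factorial(b) * factorial(n - a - b))
                              * comb(N - n, d) * (lam * (q - p))**(n - a - b) * s2**d
                              * f[n - b + d] * g[n - a + d])
        h.append(total)
    return h

N, lam, p = 5, 1.5, 0.3
Phi = np.array([[k(i, j, N, lam, p) for j in range(N + 1)] for i in range(N + 1)])
rng = np.random.default_rng(0)
f, g = rng.standard_normal(N + 1), rng.standard_normal(N + 1)
F, G = f @ Phi, g @ Phi                    # transforms: row vector times Phi
assert np.allclose(F * G, np.array(star(f, g, N, lam, p)) @ Phi)
print("transform of f*g equals the pointwise product FG")
```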

4 Krawtchouk polynomials: symmetric case

For the symmetric case, it is convenient to consider the “signs” process where \(X_i\) takes values \(\pm 1\) with equal probability, \(p=q=1/2\). Thus, \(\lambda =2\) and we have the generating function

$$\begin{aligned} G(v)=(1+2qv)^{N-j}(1-2pv)^j=(1+v)^{N-j}(1-v)^j=\sum _n v^n\,k_n(j,N) \ . \end{aligned}$$

While j runs from 0 to N, the sum \(x=N-2j\) runs from \(-N\) to N in steps of 2. Now, \(q-p=0\) and \(\sigma ^2=1\).

In terms of \(x=k_1=N-2j\), write \(k_n(j,N)=K_n(x,N)/n!\). We have the recurrence

$$\begin{aligned} x\,K_n=K_{n+1}+n(N+1-n)\,K_{n-1} \end{aligned}$$

with initial conditions \(K_0=1\), \(K_1=x\). For example, we can generate the next few polynomials

$$\begin{aligned} K_2=x^2-N,\quad K_3=x^3+(2-3N)x,\quad K_4=x^4+(8-6N)x^2+3N^2-6N . \end{aligned}$$
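These can also be generated symbolically; a short sketch using sympy (an assumption of this example only) reproduces the list above from the recurrence.

```python
import sympy as sp

x, N = sp.symbols('x N')
K = [sp.Integer(1), x]                                  # K_0, K_1
for n in range(1, 4):
    K.append(sp.expand(x * K[n] - n * (N + 1 - n) * K[n - 1]))
for n, poly in enumerate(K):
    print(f"K_{n} =", poly)
```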

The special identities and recurrences hold with \(\lambda =2\), \(p=q=1/2\) and simplify accordingly. Of particular interest is the simplification of the convolution structure.

4.1 Linearization coefficients

We want the expansion of the product \(k_\ell k_m\) in terms of \(k_n\). In Theorem 3.2, since \(q=p\), we have the condition

$$\begin{aligned} \ell +m-n=2\delta \end{aligned}$$

and the sum over \(\delta \) reduces to the single term \(\delta =(\ell +m-n)/2\). This leads to a particular set of conditions, namely, that the numbers \(\ell \), m, and n should form the sides of a triangle: each is at most the sum of the other two, and \(\ell +m+n\) is even. So, define the triangle function

$$\begin{aligned} \Delta (\ell ,m,n)=\frac{ \left( \frac{\ell +m+n}{2}\right) !}{\left( \frac{-\ell +m+n}{2}\right) !\left( \frac{\ell -m+n}{2}\right) !\left( \frac{\ell +m-n}{2}\right) !} \end{aligned}$$

where all the arguments of the factorials must be nonnegative. Note that this is a multinomial coefficient.

Proposition 4.1

In the symmetric case, the expansion of the product \(k_\ell k_m\) is

$$\begin{aligned} k_\ell (j)\,k_m(j)=\sum \left( {\begin{array}{c}n\\ \frac{\ell -m+n}{2}\end{array}}\right) \left( {\begin{array}{c}N-n\\ \frac{\ell +m-n}{2}\end{array}}\right) \,k_n(j) \ . \end{aligned}$$

Alternatively, we have

$$\begin{aligned} k_\ell (j)\,k_m(j)=\sum \left( {\begin{array}{c}N\\ \frac{\ell +m+n}{2}\end{array}}\right) \frac{\Delta (\ell ,m,n)}{\left( {\begin{array}{c}N\\ n\end{array}}\right) }\,k_n(j) \ . \end{aligned}$$

Remark

If \(\ell +m\ge N\), then the two sides differ by a polynomial vanishing on the spectrum \(\{-N,2-N,\ldots ,N-2,N\}\).

Proof

The “triangular” form follows from the binomial form by rearranging factorials. \(\square \)

4.2 Krawtchouk transforms: inversion

In the symmetric case, the matrices P and \(P'\) in Proposition 3.4 and Corollary 3.5 become identity matrices. Thus, we have

$$\begin{aligned} \Phi ^2=2^NI\qquad \text {and}\qquad \Phi ^{-1}=2^{-N}\Phi . \end{aligned}$$

So the inversion is essentially an immediate application of the original transform.

Remark

We note here as well the duality property of suitably scaled Krawtchouk polynomials, invariance under interchange of the index and argument. In the matrix formulation this appears as the property that the matrix \(\Phi B\) is symmetric, where B is the diagonal matrix with the binomial coefficients along the diagonal.

4.3 Convolution

Corresponding to the product of two transforms F and G is the convolution of the original functions f and g. In Eq. (4), the condition \(q-p=0\) entails \(n=\alpha +\beta \). We write a for \(\alpha \), replacing \(\beta =n-a\) and write b for \(\delta \). This gives for the convolution

$$\begin{aligned} (f\star g)(n)=\sum _{a,b}\left( {\begin{array}{c}n\\ a\end{array}}\right) \left( {\begin{array}{c}N-n\\ b\end{array}}\right) f(a+b)g(n-a+b)\ . \end{aligned}$$

We have the relation

$$\begin{aligned} F(j)G(j)=\sum _n (f\star g)(n)\,k_n(j) \end{aligned}$$

and the inversion simplifies to

$$\begin{aligned} (f\star g)(n)=2^{-N}\,\sum _j F(j)G(j)\,k_j(n) \end{aligned}$$

for the convolution of the original functions.
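In the symmetric case everything is integer-valued and the checks are particularly simple; the sketch below (our names, arbitrary test data) verifies both the convolution identity and its inversion.

```python
import numpy as np
from math import comb

N = 6

def k(n, j):
    """Symmetric-case k_n(j, N): coefficient of v^n in (1+v)^(N-j)(1-v)^j."""
    return sum(comb(N - j, n - i) * comb(j, i) * (-1)**i for i in range(n + 1))

Phi = np.array([[k(n, j) for j in range(N + 1)] for n in range(N + 1)], dtype=float)

def star(f, g):
    """Symmetric-case convolution."""
    return [sum(comb(n, a) * comb(N - n, b) * f[a + b] * g[n - a + b]
                for a in range(n + 1) for b in range(N - n + 1))
            for n in range(N + 1)]

rng = np.random.default_rng(1)
f, g = rng.standard_normal(N + 1), rng.standard_normal(N + 1)
F, G = f @ Phi, g @ Phi
assert np.allclose(F * G, np.array(star(f, g)) @ Phi)
assert np.allclose(star(f, g), 2.0**(-N) * (F * G) @ Phi)     # inversion formula
print("symmetric convolution and its inversion verified")
```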

5 Shift operators and matrix formulation of Krawtchouk transform and convolution

We will show how the transform and convolution can be represented by matrices acting on appropriate spaces.

5.1 Transforms

Introduce the shift operator \(T_x\) which acts on a function f(x) by

$$\begin{aligned} T_xf(x)=f(x+1) \end{aligned}$$

Similarly, \(T_yf(y)=f(y+1)\) shifts the variable y by 1. For the transform, in the generating function, we replace v by \(T_n\), the matrix shifting the argument n of f:

$$\begin{aligned} F(j)=\sum _n f(n)k_n(j)=\sum _n k_n(j)\,(T_n)^n f(0)=(1+\lambda qT_n)^{N-j}(1-\lambda p T_n)^jf(0) \end{aligned}$$

Representing f by the (column) vector of values

$$\begin{aligned} \mathbf{f} =(f(0),f(1),\ldots ,f(N))^\dag \end{aligned}$$

\(T_n\) is represented by the \((N+1)\times (N+1)\) matrix with 1’s on the superdiagonal and zeros elsewhere, and the above formula can be computed recursively using matrices of a very simple form. The value F(j) will be the top entry in the resulting vector at each step.

One approach is to form

$$\begin{aligned} T(N)=(1+\lambda qT_n)^N\quad \text {and}\quad U=(1+\lambda qT_n)^{-1}(1-\lambda p T_n) \end{aligned}$$

and compute successively

$$\begin{aligned} T(N)\mathbf{f}\,,UT(N)\mathbf{f}\,,U^2T(N)\mathbf{f}\,,\ldots ,U^NT(N)\mathbf{f} \end{aligned}$$
(5)

Form a matrix with these vectors as columns. Then the entries along the top row are the values F(j).
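A direct implementation of this procedure is short; the sketch below (the function name `krawtchouk_transform` and the test parameters are ours) iterates U on T(N)\(\mathbf{f}\), reads off the top entries, and compares with \(\mathbf{f}^\dag \Phi \).

```python
import numpy as np
from math import comb

def krawtchouk_transform(f, lam, p):
    """Compute F = f^T Phi by iterating the shift-operator factorization."""
    N = len(f) - 1
    q = 1 - p
    T = np.diag(np.ones(N), 1)                           # 1's on the superdiagonal
    I = np.eye(N + 1)
    TN = np.linalg.matrix_power(I + lam * q * T, N)
    U = np.linalg.inv(I + lam * q * T) @ (I - lam * p * T)
    F, v = np.empty(N + 1), TN @ np.asarray(f, dtype=float)
    for j in range(N + 1):
        F[j] = v[0]                                      # top entry gives F(j)
        v = U @ v
    return F

def k(n, j, N, lam, p):
    q = 1 - p
    return lam**n * sum(comb(N - j, n - i) * comb(j, i) * (-1)**i * p**i * q**(n - i)
                        for i in range(n + 1))

N, lam, p = 5, 2.0, 0.4
Phi = np.array([[k(i, j, N, lam, p) for j in range(N + 1)] for i in range(N + 1)])
f = np.arange(N + 1, dtype=float)
assert np.allclose(krawtchouk_transform(f, lam, p), f @ Phi)
print("shift-operator transform matches f^T Phi")
```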

Remark

Even though we are using the column vector \(\mathbf{f}\), we are taking the transform multiplying by \(\Phi \) on the right, that is, computing the entries of \(\mathbf{f}^\dag \Phi \).

Considering vectors \(\mathbf{f}\) with a single nonzero entry equal to one leads to another way to describe the result. Namely, the matrix whose successive columns are the first rows of the generated matrices \(U^jT(N)\) is precisely \(\Phi \). (See “Appendix”.)

Remark

Note that the matrix U has the expansion

$$\begin{aligned} U=I-\lambda T+\lambda ^2qT^2-\lambda ^3q^2T^3+\cdots \;\ . \end{aligned}$$

This follows from the identity

$$\begin{aligned} I-U=\lambda T(I+\lambda qT)^{-1} \end{aligned}$$
(6)

which may be verified by multiplying both sides by \((I+\lambda qT)\). Expanding in geometric series, noting that T is nilpotent, yields the above formula for U. The coefficients are the entries constant on successive superdiagonals of U.

Example

Let \(N=4\). We have

$$\begin{aligned} T(4)=(I+\lambda q T)^4= \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1&{}4\,\lambda \,q&{}6\,{\lambda }^{2}{q}^{2}&{} 4\,{\lambda }^{3}{q}^{3}&{}{\lambda }^{4}{q}^{4}\\ 0&{}1&{}4 \,\lambda \,q&{}6\,{\lambda }^{2}{q}^{2}&{}4\,{\lambda }^{3}{q}^{3} \\ 0&{}0&{}1&{}4\,\lambda \,q&{}6\,{\lambda }^{2}{q}^{2} \\ 0&{}0&{}0&{}1&{}4\,\lambda \,q\\ 0&{}0&{}0&{}0&{}1 \end{array} \right] \end{aligned}$$

and

$$\begin{aligned} U=(1+\lambda qT)^{-1}(1-\lambda p T)=\left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1&{}-\lambda &{}{\lambda }^{2}q&{}-{\lambda }^{3} {q}^{2}&{}{\lambda }^{4}{q}^{3}\\ 0&{}1&{}-\lambda &{}{\lambda } ^{2}q&{}-{\lambda }^{3}{q}^{2}\\ 0&{}0&{}1&{}-\lambda &{}{\lambda }^{2}q\\ 0&{}0&{}0&{}1&{}-\lambda \\ 0&{}0&{}0&{}0&{} 1\end{array} \right] \end{aligned}$$

Starting with a column vector \(\mathbf{f}\), first multiplying by T(4), then successively by U produces one-by-one the entries of the transform of \(\mathbf{f}\).

For the symmetric case, \(\lambda =2\), \(p=q=1/2\), T(N) has the binomial coefficients along the superdiagonals while, except for 1’s on the diagonal, the entries of U are \(\pm \,2\) on alternating superdiagonals. Thus,

$$\begin{aligned} T(4)=(I+ T)^4=\left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1&{}4&{}6&{}4&{}1\\ 0&{}1&{}4&{}6&{}4 \\ 0&{}0&{}1&{}4&{}6\\ 0&{}0&{}0&{}1&{}4 \\ 0&{}0&{}0&{}0&{}1\end{array} \right] \quad \text {and}\quad U= \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1&{}-2&{}2&{}-2&{}2\\ 0&{}1&{}-2&{}2&{} -2\\ 0&{}0&{}1&{}-2&{}2\\ 0&{}0&{}0&{}1&{}-2 \\ 0&{}0&{}0&{}0&{}1\end{array} \right] \ . \end{aligned}$$

Similarly, replacing the variables v and w in Eq. (3), by \(T_n\) and \(T_m\) respectively yields the formula

$$\begin{aligned}&\sum _{n,m}k_n(j,N)k_m(j,N) (T_n)^n (T_m)^m f(0)g(0)\\&\quad =\sum _n k_n(j,N) \,(1+\sigma ^2T_nT_m)^{N-n}(T_n+T_m+\lambda (q-p)T_nT_m)^n f(0)g(0) \end{aligned}$$

Representing \(T_mT_n\) by the Kronecker/tensor product of the corresponding shift matrices provides an explicit matrix that, when applied to the tensor product of the vectors \(\mathbf{f}\) and \(\mathbf{g}\), yields the convolution \(f\star g\). (See “Appendix” for an example.)

So the convolution can be computed analogously to the transform. Start with

$$\begin{aligned} T(N)=(1+\sigma ^2T_nT_m)^N\quad \text {and}\quad U=(1+\sigma ^2T_nT_m)^{-1}(T_n+T_m+\lambda (q-p)T_nT_m) \end{aligned}$$

and compute successively as in Eq. (5).
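The sketch below (our names and sample parameters) carries this out with Kronecker products and checks that the result transforms to the pointwise product of the transforms.

```python
import numpy as np
from math import comb

def k(n, j, N, lam, p):
    q = 1 - p
    return lam**n * sum(comb(N - j, n - i) * comb(j, i) * (-1)**i * p**i * q**(n - i)
                        for i in range(n + 1))

N, lam, p = 4, 1.5, 0.3
q = 1 - p
s2 = lam**2 * p * q
I1 = np.eye(N + 1)
T = np.diag(np.ones(N), 1)
Tn, Tm = np.kron(T, I1), np.kron(I1, T)               # shifts acting on the f and g slots
I = np.eye((N + 1)**2)
TN = np.linalg.matrix_power(I + s2 * Tn @ Tm, N)
U = np.linalg.inv(I + s2 * Tn @ Tm) @ (Tn + Tm + lam * (q - p) * Tn @ Tm)

rng = np.random.default_rng(2)
f, g = rng.standard_normal(N + 1), rng.standard_normal(N + 1)
v, h = TN @ np.kron(f, g), np.empty(N + 1)            # start from T(N)(f tensor g)
for n in range(N + 1):
    h[n] = v[0]                                       # top entry gives (f star g)(n)
    v = U @ v

Phi = np.array([[k(i, j, N, lam, p) for j in range(N + 1)] for i in range(N + 1)])
assert np.allclose((f @ Phi) * (g @ Phi), h @ Phi)    # product of transforms = transform of f star g
print("Kronecker shift-operator convolution verified")
```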

6 Dual transforms: binomial bases

Of course, one could define transforms dually by multiplying column vectors:

$$\begin{aligned} F(n)=\sum _j k_n(j) f(j) \qquad \text {or} \qquad \mathbf{F} =\Phi \mathbf{f} \ . \end{aligned}$$

Let’s begin with an example.

Example

For the symmetric case, we observe the result

$$\begin{aligned} \left[ \begin{array}{r@{\quad }r@{\quad }r@{\quad }r@{\quad }r} 1&{}1&{}1&{}1&{}1\\ 4&{}2&{}0&{}-2&{}-4 \\ 6&{}0&{}-2&{}0&{}6\\ 4&{}-2&{}0&{}2&{}-4 \\ 1&{}-1&{}1&{}-1&{}1\end{array} \right] \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1&{}1&{}1&{}1&{}1\\ 0&{}1&{}2&{}3&{}4 \\ 0&{}0&{}1&{}3&{}6\\ 0&{}0&{}0&{}1&{}4 \\ 0&{}0&{}0&{}0&{}1\end{array} \right] = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1&{}2&{}4&{}8&{}16\\ 4&{}6&{}8&{}8&{}0 \\ 6&{}6&{}4&{}0&{}0\\ 4&{}2&{}0&{}0&{}0 \\ 1&{}0&{}0&{}0&{}0\end{array} \right] \ . \end{aligned}$$

The matrix on the left is \(\Phi ^{(4)}\). Observe that column m, say, of binomial coefficients is mapped to its partner column indexed by \(N-m\), scaled by \(2^m\). Note that, as functions, those with zero tails are mapped to functions with zero tails, analogously to Fourier transforms of compactly supported functions or to cutoff functions for filtering.

In general we have

Proposition 6.1

Let \(f_m(j)=\displaystyle \left( {\begin{array}{c}m\\ j\end{array}}\right) \,p^{m-j}q^j\). Then the dual transform, \(\mathbf{F}_m =\Phi \mathbf{f}_m\), is given by

$$\begin{aligned} F_m(n)=\left( {\begin{array}{c}N-m\\ n\end{array}}\right) \,(\lambda q)^n . \end{aligned}$$

Proof

We show the generating function version of the relation. Thus,

$$\begin{aligned} \sum _j (1+\lambda qv)^{N-j}&(1-\lambda pv)^j \displaystyle \left( {\begin{array}{c}m\\ j\end{array}}\right) \,p^{m-j}q^j\\&=(1+\lambda qv)^{N-m}\sum _j\displaystyle \left( {\begin{array}{c}m\\ j\end{array}}\right) \,\bigl (q(1-\lambda pv)\bigr )^j\bigl (p(1+\lambda qv)\bigr )^{m-j} \\&=(1+\lambda qv)^{N-m}(q-\lambda pq v+p+\lambda pqv)^m\\&=(1+\lambda qv)^{N-m}=\sum _n v^n\left( {\begin{array}{c}N-m\\ n\end{array}}\right) \,(\lambda q)^n . \end{aligned}$$

\(\square \)

So in this basis, call it the binomial basis, \(\Phi \) is represented by a matrix with entries on the antidiagonal. Continuing our example, write

$$\begin{aligned} {\mathcal B}=\left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1&{}1&{}1&{}1&{}1\\ 0&{}1&{}2&{}3&{}4 \\ 0&{}0&{}1&{}3&{}6\\ 0&{}0&{}0&{}1&{}4 \\ 0&{}0&{}0&{}0&{}1\end{array} \right] \qquad \text {and} \qquad J=\left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&{}0&{}0&{}0&{}1\\ 0&{}0&{}0&{}1&{}0\\ 0&{}0&{}1&{}0&{}0\\ 0&{}1&{}0&{}0&{}0 \\ 1&{}0&{}0&{}0&{}0\end{array} \right] \end{aligned}$$

With D the diagonal matrix \(\mathrm{diag}({1,2,2^2,2^3,2^4})\), we have

$$\begin{aligned} \Phi {\mathcal B}={\mathcal B}JD \qquad \text {or} \qquad {{\mathcal B}}^{-1}\Phi {\mathcal B}=JD \end{aligned}$$
(7)

with

$$\begin{aligned} JD = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&{}0&{}0&{}0&{}16\\ 0&{}0&{}0&{}8&{}0 \\ 0&{}0&{}4&{}0&{}0\\ 0&{}2&{}0&{}0&{}0 \\ 1&{}0&{}0&{}0&{}0\end{array} \right] \end{aligned}$$

the matrix representing the transform in the binomial basis. These relations extend to all N.
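The relation of Eq. (7) is easy to confirm for the example above; the sketch below (our matrix names) builds \(\Phi ^{(4)}\), \({\mathcal B}\), J, and D and checks both forms.

```python
import numpy as np
from math import comb

N = 4

def k(n, j):
    return sum(comb(N - j, n - i) * comb(j, i) * (-1)**i for i in range(n + 1))

Phi = np.array([[k(n, j) for j in range(N + 1)] for n in range(N + 1)], dtype=float)
B = np.array([[comb(m, j) for m in range(N + 1)] for j in range(N + 1)], dtype=float)  # column m holds C(m, j)
J = np.fliplr(np.eye(N + 1))
D = np.diag([2.0**m for m in range(N + 1)])

assert np.allclose(Phi @ B, B @ J @ D)                     # Eq. (7)
assert np.allclose(np.linalg.inv(B) @ Phi @ B, J @ D)      # Phi is anti-diagonal in the binomial basis
print("binomial-basis relation verified")
```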

A related family of transforms is indicated by the similar calculation

$$\begin{aligned} \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1&{}0&{}0&{}0&{}0\\ 4&{}1&{}0&{}0&{}0 \\ 6&{}3&{}1&{}0&{}0\\ 4&{}3&{}2&{}1&{}0 \\ 1&{}1&{}1&{}1&{}1\end{array} \right] \Phi = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1&{}1&{}1&{}1&{}1\\ 8&{}6&{}4&{}2&{}0 \\ 24&{}12&{}4&{}0&{}0\\ 32&{}8&{}0&{}0&{}0 \\ 16&{}0&{}0&{}0&{}0\end{array} \right] \end{aligned}$$

which can be expressed in the form

$$\begin{aligned} J{\mathcal B}J\Phi =D{\mathcal B}J \qquad \text {or} \qquad (J{{\mathcal B}}J)\Phi (J{\mathcal B}J)^{-1}=DJ \end{aligned}$$

with J, D, and \({\mathcal B}\) as above. Comparing with Eq. (7) indicates a connection between \(\Phi \) and \(\Phi ^T\). At this point it is straightforward to give a direct proof of the properties we want.

Proposition 6.2

Let \(f_i(n)=\displaystyle \left( {\begin{array}{c}N-n\\ N-i\end{array}}\right) \,p^{i-n}\lambda ^{-n}\). Then the transform, \(\mathbf{F}_i^\dag =\mathbf{f}_i^\dag \Phi \), is given by

$$\begin{aligned} F_i(j)=\left( {\begin{array}{c}N-j\\ i\end{array}}\right) \ . \end{aligned}$$

Proof

As in the previous proposition, we show the generating function version of the relation. Consider

$$\begin{aligned} \sum _i\sum _n \left( {\begin{array}{c}N-n\\ N-i\end{array}}\right) \,p^{i-n}\lambda ^{-n}k_n(j)v^i&=\sum _i\sum _n \left( {\begin{array}{c}N-n\\ i\end{array}}\right) p^{N-i} (\lambda p)^{-n}k_n(j) v^{N-i}\\&=\sum _n (pv)^n\sum _i \left( {\begin{array}{c}N-n\\ i\end{array}}\right) (vp)^{N-n-i} (\lambda p)^{-n}k_n(j) \\&=\sum _n (pv)^n(1+pv)^{N-n}(\lambda p)^{-n}\,k_n(j)\\&=(1+pv)^N\sum _n \left( \frac{v/\lambda }{1+pv}\right) ^n\,k_n(j)\\&=(1+pv)^N\,\left( 1+\frac{qv}{1+pv}\right) ^{N-j}\left( 1-\frac{pv}{1+pv}\right) ^{j}\\&=(1+v)^{N-j}=\sum _i \left( {\begin{array}{c}N-j\\ i\end{array}}\right) v^i \ . \end{aligned}$$

\(\square \)

7 Concluding remarks

We have presented Krawtchouk transforms, which have the potential to provide an inherently discrete, efficient alternative to Fourier analysis. By giving effective matrix-based algorithms for computing transforms and convolution products, we have provided tools that are not only of theoretical interest but are ready for practical applications. The special binomial transforms we have indicated provide a solid basis for filtering techniques as well. Thus, Krawtchouk analogs of the standard Fourier toolkit are now available. Digital image analysis, for example, will provide an important arena for illustrating and developing the Krawtchouk methods presented in this work.