We will use the combinatorial interpretation of the (i, j)-th entry in the k-th power of a matrix as a weighted sum of paths from i to j.
Consider a set V together with a subset E of \(V\times V\). We may regard the pair (V, E) as a set with a binary relation or as a directed graph. Choosing the interpretation as a graph, we may associate monomials to paths and polynomials to finite sets of paths.
For our purposes, a path of length \(k\ge 1\) from a to b (where \(a,b\in V\)) in a directed graph (V, E) is a sequence \(e_1e_2\ldots e_{k}\) of edges \(e_i=(a_i,b_i)\in E\) such that \(a_1=a\), \(b_{k}=b\) and \(b_{j}=a_{j+1}\) for \(1\le j<k\). Also, for each \(a\in V\), there is a path of length 0 from a to a, which we denote by \(\varepsilon _a\). (For \(a\ne b\) there is no path of length 0 from a to b.)
We introduce a set of independent variables \(X=\{x_{v\, w}\mid (v,\, w)\in V\times V\}\) and consider the polynomial ring
$$\begin{aligned} R[X]=R[\{x_{v\, w}\mid (v,\, w)\in V\times V\}] \end{aligned}$$
with coefficients in the commutative ring R.
To each edge \(e=(a, b)\) in E, we associate the variable \(x_{ab}\) and to each path \(e_1e_2\ldots e_{k}\) of length k, with \(e_i=(a_i,b_i)\), the monomial of degree k which is the product of the variables associated to the edges of the path: \(x_{a_1b_1}x_{a_2b_2}\ldots x_{a_kb_k}\). To each path of length 0 we associate the monomial 1.
If E is finite, or, more generally, if for any pair of vertices \(a,b\in V\) and fixed \(k\ge 0\) there are only finitely many paths in (V, E) from a to b, we define the k-th path polynomial from a to b, denoted by \(p_{a b}^{(k)}\), as the sum in R[X] of the monomials corresponding to paths of length k from a to b. If there is no path of length k from a to b in (V, E), we set \(p_{a b}^{(k)}=0\).
From now on, we fix the relation \((\mathbb {N}, \le )\) and all path polynomials will refer to the graph of \((\mathbb {N}, \le )\), where \(\mathbb {N}=\{1, 2, 3, \ldots \}\), or one of its finite subgraphs given by intervals. The finite subgraph given by the interval
$$\begin{aligned}{}[i,j]=\{k\in \mathbb {N}\mid i\le k\le j\} \end{aligned}$$
is the graph with set of vertices [i, j] and set of edges \(\{(a,b)\mid i\le a\le b\le j\}\).
Because of the transitivity of the relation “\(\le \)”, a path in \((\mathbb {N},\le )\) from a to b involves only vertices in the interval \([a,\, b]\). The path polynomial \(p_{a b}^{(k)}\), therefore, is the same whether we consider a, b as vertices in the graph \((\mathbb {N},\le )\), or any subgraph given by an interval [i, j] with \(a,b\in [i,j]\). So we may suppress all references to intervals and subgraphs and define:
Definition 2.1
Let R be a commutative ring. The k-th path polynomial from i to j (corresponding to the relation \((\mathbb {N},\le )\)) in R[X] is defined
-
(1)
for \(1\le i\le j\) and \(k> 0\) by
$$\begin{aligned} p_{ij}^{(k)}=\sum _{i=i_1\le i_2\le \ldots \le i_{k+1}=j} x_{i_1 i_2}x_{i_2 i_3}\ldots x_{i_{k-1}i_k}x_{i_k i_{k+1}}, \end{aligned}$$
-
(2)
for \(1\le i \le j\) and \(k=0\) by \(p_{ij}^{(0)}=\delta _{ij}\),
-
(3)
for \(i>j\) and all k: \(p_{ij}^{(k)}=0\).
For \(a,b\in \mathbb {N}\), we define the sequence of path polynomials from a to b as
$$\begin{aligned} p_{a b}=(p_{a b}^{(k)})_{k\ge 0}. \end{aligned}$$
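Definition 2.1 can be made concrete by listing the weakly increasing chains \(i=i_1\le i_2\le \ldots \le i_{k+1}=j\). The following Python sketch is only an illustration: the function name `path_monomials` and the encoding of a monomial as a tuple of index pairs (a, b) standing for \(x_{ab}\) are assumptions made here, not notation from the text.

```python
from itertools import combinations_with_replacement

def path_monomials(i, j, k):
    """List the monomials of p_ij^(k); each monomial is a tuple of pairs (a, b) for x_ab."""
    if i > j:
        return []                        # case (3): no path from i to j
    if k == 0:
        return [()] if i == j else []    # case (2): the empty monomial 1 iff i == j
    monomials = []
    # Case (1): the interior vertices i_2 <= ... <= i_k form a weakly
    # increasing (k-1)-tuple with entries in the interval [i, j].
    for interior in combinations_with_replacement(range(i, j + 1), k - 1):
        chain = (i, *interior, j)
        monomials.append(tuple(zip(chain, chain[1:])))
    return monomials

# Monomials of p_24^(2): x22*x24, x23*x34, x24*x44
second = path_monomials(2, 4, 2)
```

Under this encoding, the number of monomials of \(p_{ij}^{(k)}\) for \(k\ge 1\) is the number of weakly increasing \((k-1)\)-tuples in [i, j].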
Remark 2.2
Note that \(p_{ij}^{(k)}\) is the (i, j)-th entry of the k-th power of a generic upper triangular \(n\times n\) matrix (with \(n\ge i,j\)) whose (i, j)-th entry is \(x_{ij}\) when \(i\le j\) and zero otherwise.
Example 2.3
The sequence of path polynomials from 2 to 4 is
$$\begin{aligned} p_{2 4}=(0,\;\; x_{24},\;\; x_{22}x_{24}+x_{23}x_{34}+x_{24}x_{44},\;\ldots \ ) \end{aligned}$$
and
$$\begin{aligned} p^{(3)}_{24}= x_{22}x_{22}x_{24} + x_{22}x_{23}x_{34} + x_{22}x_{24}x_{44} + x_{23}x_{33}x_{34} + x_{23}x_{34}x_{44} + x_{24}x_{44}x_{44} \end{aligned}$$
Again, note that \(p_{2 4}\) is the sequence of entries in position (2, 4) in the powers \(G^0,G, G^2, G^3,\ldots \) of a generic \(n\times n\) (with \(n\ge 4\)) upper triangular matrix \(G=(g_{ij})\) with \(g_{ij}=x_{ij}\) for \(i\le j\) and \(g_{ij}=0\) otherwise.
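The observation of Remark 2.2 and Example 2.3 can be checked mechanically for small n. The sketch below is a hedged illustration and not part of the original argument: it stores a polynomial with integer coefficients as a `Counter` mapping monomials (sorted tuples of index pairs) to coefficients, and the helper names `path_poly`, `mat_mul`, `generic_upper` and `identity` are assumptions introduced here.

```python
from collections import Counter
from itertools import combinations_with_replacement

def path_poly(i, j, k):
    """p_ij^(k) as a Counter {monomial: coefficient}; a monomial is a sorted tuple of pairs."""
    poly = Counter()
    if i > j or (k == 0 and i != j):
        return poly                      # the zero polynomial
    if k == 0:
        poly[()] = 1                     # the constant polynomial 1
        return poly
    for interior in combinations_with_replacement(range(i, j + 1), k - 1):
        chain = (i, *interior, j)
        poly[tuple(sorted(zip(chain, chain[1:])))] += 1
    return poly

def mat_mul(A, B):
    """Multiply square matrices whose entries are Counter polynomials."""
    n = len(A)
    C = [[Counter() for _ in range(n)] for _ in range(n)]
    for r in range(n):
        for c in range(n):
            for m in range(n):
                for mono_a, coef_a in A[r][m].items():
                    for mono_b, coef_b in B[m][c].items():
                        C[r][c][tuple(sorted(mono_a + mono_b))] += coef_a * coef_b
    return C

def generic_upper(n):
    """G = (g_ij) with g_ij = x_ij for i <= j and 0 otherwise (indices start at 1)."""
    return [[Counter({((r, c),): 1}) if r <= c else Counter()
             for c in range(1, n + 1)] for r in range(1, n + 1)]

def identity(n):
    return [[Counter({(): 1}) if r == c else Counter() for c in range(n)]
            for r in range(n)]

n = 4
G = generic_upper(n)
power = identity(n)                      # G^0
checks = []
for k in range(5):
    # Compare every entry of G^k with the corresponding path polynomial.
    checks.append(all(power[i - 1][j - 1] == path_poly(i, j, k)
                      for i in range(1, n + 1) for j in range(1, n + 1)))
    power = mat_mul(power, G)
```

Each entry of `checks` records whether \(G^k\) and the path polynomials agree in all positions for that k.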
In addition to right and left substitution of a matrix for the variable in a polynomial in R[x] or \((\mathrm{T}_n(R))[x]\), we are going to use another way of plugging matrices into polynomials, namely, into polynomials in \(R[X]=R[\{x_{ij}\mid i,j\in \mathbb {N}\}]\). For this purpose, the matrix \(C=(c_{ij})\in M_n(R)\) is regarded as a vector of elements of R indexed by \(\mathbb {N}\times \mathbb {N}\), with \(c_{ij}=0\) for \(i>n\) or \(j>n\):
Definition 2.4
For a polynomial \(p\in R[X]=R[\{x_{ij}\mid i, j\in \mathbb {N}\}]\) and a matrix \(C=(c_{ij})\in M_n(R)\) we define p(C) as the result of substituting \(c_{ij}\) for those \(x_{ij}\) in p with \(i,j\le n\) and substituting 0 for all \(x_{kh}\) with \(k>n\) or \(h>n\).
To be able to describe the (i, j)-th entry in f(C), where \(f\in R[x]\), we need one more construction: for sequences of polynomials \(p=(p_i)_{i\ge 0}, q=(q_i)_{i\ge 0}\) in R[X], at least one of which is finite, we define a scalar product \(\langle p, q\rangle =\sum _i p_i q_i\). In fact, we only need one special instance of this construction: the one where one of the sequences is the sequence of coefficients of a polynomial in R[x] and the other is a sequence of path polynomials from a to b.
Definition 2.5
Given \(f=f_0+f_1x+\cdots +f_mx^m\in R[x]\) (which we identify with the sequence of its coefficients), \(a,\, b\in \mathbb {N}\), and \(p_{a b}=(p_{a b}^{(k)})_{k=0}^\infty \) the sequence of path polynomials from a to b as in Definition 2.1, we define
$$\begin{aligned} \langle f,\, p_{a b}\rangle =\sum _{k\ge 0} f_k\, p_{a b}^{(k)}. \end{aligned}$$
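As a worked illustration (the quadratic f is chosen here only as an example), take \(f=f_0+f_1x+f_2x^2\) and the path polynomials from 2 to 4 listed in Example 2.3. Since \(p_{2 4}^{(0)}=0\), the constant coefficient drops out:
$$\begin{aligned} \langle f,\, p_{2 4}\rangle = f_1\, x_{24}+f_2\, (x_{22}x_{24}+x_{23}x_{34}+x_{24}x_{44}). \end{aligned}$$
On the diagonal, the only chain from 2 to 2 is constant, so \(p_{2 2}^{(k)}=x_{22}^k\) by Definition 2.1, and the same construction gives
$$\begin{aligned} \langle f,\, p_{2 2}\rangle = f_0+f_1\, x_{22}+f_2\, x_{22}^2, \end{aligned}$$
that is, f evaluated at \(x_{22}\).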
Definition 2.6
For a polynomial \(p\in R[X]=R[\{x_{ij}\mid i,j\in \mathbb {N}\}]\) and \(S\subseteq R\), we define the image \(p(S^{*})\subseteq R\) as the set of values of p as the variables occurring in p range through S independently. (The star in \(S^{*}\) serves to remind us that the arguments of p are not elements of S, but k-tuples of elements of S for unspecified k.)
We define \({{\mathrm{Int}}}(S^{*},I)\) as the set of those polynomials in \(R[X]=R[\{x_{ij}\mid i,j\in \mathbb {N}\}]\) that take values in I whenever elements of S are substituted for the variables.
The notation \({{\mathrm{Int}}}(S^{*},I)\) is suggested by the convention that \({{\mathrm{Int}}}(S,I)\) consists of polynomials in one indeterminate mapping elements of S to elements of I and, for \(k\in \mathbb {N}\), \({{\mathrm{Int}}}(S^{k},I)\) consists of polynomials in k indeterminates mapping k-tuples of elements of S to I.
We summarize here the connection between path polynomials and the related constructions of Definitions 2.1, 2.4, 2.5 and 2.6 with entries of powers of matrices and entries of the image of a matrix under a polynomial function.
Remark 2.7
Let R be a commutative ring, \(C\in \mathrm{T}_n(R)\), \(k\ge 0\), \(1\le i,j\le n\), and \(p_{ij}^{(k)}\) the k-th path polynomial from i to j in R[X] as in Definition 2.1.
-
(1)
\(\left[ C^k\right] _{i j} = p_{ij}^{(k)}(C)\)
-
(2)
For \(f\in R[x]\), \(\left[ f(C)\right] _{i j} = \langle f, p_{i j}\rangle (C)\).
-
(3)
If the i-th row or the j-th column of C is zero then \(p_{ij}^{(k)}(C)=0\), and for all \(f\in R[x]\), \(\langle f, p_{i j}\rangle (C)=0\).
-
(4)
\(p_{ij}^{(k)}(S^{*})= \{p_{ij}^{(k)}(C)\mid C\in \mathrm{T}_n(S)\}.\)
Proof
(1) and (2) follow immediately from Definitions 2.1, 2.4 and 2.5. Compare Remark 2.2. (3) follows from (2) and Definition 2.4, since every monomial occurring in \(p_{ij}^{(k)}\) involves a variable \(x_{im}\) for some m and a variable \(x_{hj}\) for some h. Also, (4) follows from Definitions 2.1 and 2.4.
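Statements (1) and (2) of Remark 2.7 can be spot-checked numerically. The following sketch assumes \(R=\mathbb {Z}\) and uses an arbitrarily chosen matrix C and polynomial f; all function names are hypothetical helpers introduced for this illustration, and the check is a sanity test on one example, not a proof.

```python
from itertools import combinations_with_replacement

def path_poly_at(i, j, k, C):
    """Evaluate p_ij^(k)(C) as in Definition 2.4; i, j are 1-based, C is a list of rows."""
    if i > j or (k == 0 and i != j):
        return 0
    if k == 0:
        return 1
    total = 0
    for interior in combinations_with_replacement(range(i, j + 1), k - 1):
        chain = (i, *interior, j)
        term = 1
        for a, b in zip(chain, chain[1:]):
            term *= C[a - 1][b - 1]      # substitute c_ab for x_ab
        total += term
    return total

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[r][m] * B[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def poly_at_matrix(f, C):
    """f(C) = sum_k f[k] * C^k for the coefficient list f = [f_0, f_1, ...]."""
    n = len(C)
    power = [[int(r == c) for c in range(n)] for r in range(n)]   # C^0
    result = [[0] * n for _ in range(n)]
    for coef in f:
        result = [[result[r][c] + coef * power[r][c] for c in range(n)]
                  for r in range(n)]
        power = mat_mul(power, C)
    return result

def scalar_product_at(f, i, j, C):
    """<f, p_ij>(C) = sum_k f_k * p_ij^(k)(C)."""
    return sum(coef * path_poly_at(i, j, k, C) for k, coef in enumerate(f))

C = [[2, -1, 3],
     [0, 5, 4],
     [0, 0, -2]]
f = [7, 0, -3, 1]                        # f = 7 - 3x^2 + x^3

C2 = mat_mul(C, C)
statement_1 = all(C2[i - 1][j - 1] == path_poly_at(i, j, 2, C)
                  for i in range(1, 4) for j in range(1, 4))
F = poly_at_matrix(f, C)
statement_2 = all(F[i - 1][j - 1] == scalar_product_at(f, i, j, C)
                  for i in range(1, 4) for j in range(1, 4))
```

Here `statement_1` compares \(C^2\) with the values \(p_{ij}^{(2)}(C)\), and `statement_2` compares f(C) with the values \(\langle f, p_{ij}\rangle (C)\).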
Lemma 2.8
Let \(f\in R[x]\). The image of \(\langle f, p_{ij}\rangle \) under substitution of elements of S for the variables depends only on f and \(j-i\), that is, for all \(i\le j\) and all \(m\in \mathbb {N}\)
$$\begin{aligned} \langle f, p_{ij}\rangle (S^{*})= \langle f, p_{i+m\> j+m}\rangle (S^{*}). \end{aligned}$$
Proof
The R-algebra isomorphism
$$\begin{aligned} \psi :R[\{x_{hk}\mid i\le h\le k\le j\}]\rightarrow R[\{x_{hk}\mid i+m\le h\le k\le j+m\}] \end{aligned}$$
with \(\psi (x_{hk}) = x_{h+m\, k+m}\) and \(\psi (r)=r\) for all \(r\in R\) maps \(\langle f, p_{ij}\rangle \) to \(\langle f, p_{i+m\> j+m}\rangle \).
Applying \(\psi \) amounts to a renaming of variables; it doesn’t affect the image of the polynomial function resulting from substituting elements of S for the variables. \(\square \)
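To see the renaming at work in a small case (chosen here for illustration only), let \(m=1\) and \(f=f_0+f_1x+f_2x^2\). Shifting every index by 1, so that \(x_{hk}\) becomes \(x_{h+1\, k+1}\), gives
$$\begin{aligned} \psi (\langle f, p_{2 4}\rangle )&=\psi \bigl (f_1\, x_{24}+f_2\, (x_{22}x_{24}+x_{23}x_{34}+x_{24}x_{44})\bigr )\\ &=f_1\, x_{35}+f_2\, (x_{33}x_{35}+x_{34}x_{45}+x_{35}x_{55})=\langle f, p_{3 5}\rangle . \end{aligned}$$
Both expressions have the same shape in differently named variables, so they take the same set of values when the variables range through S.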
Proposition 2.9
Let \(f\in R[x]\). The following are equivalent:
-
(1)
\(f\in {{\mathrm{Int}}}_R(\mathrm{T}_n(S), \mathrm{T}_n(I))\)
-
(2)
\(\forall \; 1\le i\le j\le n\quad \langle f, p_{ij}\rangle \in {{\mathrm{Int}}}(S^{*},I)\)
-
(3)
\(\forall \; 0\le k\le n-1\quad \exists i\in \mathbb {N}\; \langle f, p_{i\> i+k}\rangle \in {{\mathrm{Int}}}(S^{*},I)\)
Proof
The (i, j)-th entry of f(C), for \(C\in \mathrm{T}_n(R)\), is \(\langle f, p_{ij}\rangle (C)\), by Remark 2.7 (2). If C varies through \(\mathrm{T}_n(S)\), then all variables occurring in \(\langle f, p_{ij}\rangle \) vary through S independently. This shows the equivalence of (1) and (2).
By Lemma 2.8, the image of \(\langle f, p_{ij}\rangle \) as the variables range through S depends only on f and \(j-i\). This shows the equivalence of (2) and (3). \(\square \)
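The equivalence of (2) and (3) can be explored by brute force in a toy setting. The sketch below rests on assumptions that are not taken from the text: \(R=\mathbb {Z}\), a small finite set S of integers, \(I=d\mathbb {Z}\), and \(i=1\) as the witness in condition (3). The names `condition_2`, `condition_3` and `upper_triangular` are hypothetical helpers.

```python
from itertools import combinations_with_replacement, product

def path_poly_at(i, j, k, C):
    """Evaluate p_ij^(k)(C); i and j are 1-based, C is a list of rows."""
    if i > j or (k == 0 and i != j):
        return 0
    if k == 0:
        return 1
    total = 0
    for interior in combinations_with_replacement(range(i, j + 1), k - 1):
        chain = (i, *interior, j)
        term = 1
        for a, b in zip(chain, chain[1:]):
            term *= C[a - 1][b - 1]
        total += term
    return total

def scalar_product_at(f, i, j, C):
    """<f, p_ij>(C) for the coefficient list f = [f_0, f_1, ...]."""
    return sum(coef * path_poly_at(i, j, k, C) for k, coef in enumerate(f))

def upper_triangular(n, S):
    """Yield every n x n upper triangular matrix with entries on and above the diagonal in S."""
    slots = [(r, c) for r in range(n) for c in range(r, n)]
    for values in product(S, repeat=len(slots)):
        C = [[0] * n for _ in range(n)]
        for (r, c), v in zip(slots, values):
            C[r][c] = v
        yield C

def condition_2(f, n, S, d):
    """All <f, p_ij> with 1 <= i <= j <= n take values in dZ on S."""
    return all(scalar_product_at(f, i, j, C) % d == 0
               for C in upper_triangular(n, S)
               for i in range(1, n + 1) for j in range(i, n + 1))

def condition_3(f, n, S, d):
    """For each k in 0..n-1, <f, p_{1, 1+k}> takes values in dZ on S (witness i = 1)."""
    return all(scalar_product_at(f, 1, 1 + k, C) % d == 0
               for C in upper_triangular(n, S)
               for k in range(n))

n, S, d = 3, (0, 1, 2), 2
candidates = {
    "x^2 - x": [0, -1, 1],
    "2x": [0, 2],
    "x^4 + x^2": [0, 0, 1, 0, 1],
}
results = {name: (condition_2(f, n, S, d), condition_3(f, n, S, d))
           for name, f in candidates.items()}
```

In this toy setting, \(x^2-x\) maps every element of S into \(2\mathbb {Z}\) but fails both conditions, because \(\langle f, p_{1 2}\rangle = x_{12}(x_{11}+x_{22}-1)\) takes the value \(-1\) at \(x_{11}=x_{22}=0\), \(x_{12}=1\).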