1 Introduction

Matrix norms are often used in the analysis of algorithms. A map \( \Vert \cdot \Vert :M_{m\times n} \rightarrow \mathbb {R} \) is a matrix norm if, for all \( A, B \in M_{m\times n} \), the following five axioms are satisfied: (i) non-negativity, (ii) definiteness, (iii) absolute homogeneity, (iv) triangular inequality and (v) submultiplicativity. The first four properties of a matrix norm are identical to the axioms of a vector norm. That is, a map on matrices that satisfies (i), (ii), (iii), (iv) is called a vector norm or a generalized matrix norm. The notions of a matrix seminorm and a generalized matrix seminorm are defined via omission of axiom (ii). Deutsch (1980) presented two methods of generating classes of generalized matrix norms. Johnson (1977) explored the relation between multiplicativity and compatibility for a generalized matrix norm. Dyer et al. (2009) showed that rapid mixing of both random update Glauber dynamics and systematic scan Glauber dynamics occurs if any matrix norm of the associated dependency matrix is less than 1. The transition probability matrix is an intrinsic part of a Markov chain. Koppula et al. (2019) presented a link between Markov chains and rough sets. Kedukodi et al. (2009) defined the concepts of equiprime fuzzy ideal, 3-prime fuzzy ideal and c-prime fuzzy ideal of a nearring. Kedukodi et al. (2019) introduced interval valued equiprime, 3-prime and c-prime L-fuzzy ideals of a nearring N by using interval valued t-norms and interval valued t-conorms and characterized them. Subsequently, Koppula et al. (2020) related these ideas with Markov frameworks and gave their application in decision-making problems. Aishwarya et al. (2022) investigated permutation identities satisfied by weak semigroup left ideals in prime nearrings and obtained results on commutativity of addition and multiplication using these identities. Aishwarya et al.
(2023) introduced the notion of a product fractal ideal of a ring using permutations of finite sets and multiplication operation in the ring and proved fractal isomorphism theorems that extend the classical isomorphism theorems in rings.

Higham (1992) showed that the Hölder \( p \)-norm of a matrix can be estimated reliably in O(mn) operations. Hendrickx and Olshevsky (2010) showed that for any rational \( p \in \left[ 1, \infty \right) \) except \( p = 1, 2\), unless \( P = NP \), there is no polynomial-time algorithm that approximates the matrix \( p \)-norm to arbitrary relative precision. Li et al. (2014) approximated \( \Vert A\Vert _p \) using sketching models. Arens (1952) gave a generalization of normed rings. Ozaki et al. (1953) obtained a sufficient condition for an analytic function to be univalent in a convex domain in a normed ring. Duran and İzgi (2014, 2015) studied the behavior of solutions for several stochastic differential equations, introduced three-dimensional matrix norms and defined the market impression matrix norm as an application. Further, İzgi (2015) investigated different stochastic differential equations in the field of mathematical finance. İzgi and Özkaya (2017) presented and proved norm inequalities for 3-dimensional matrices and showed their usefulness in real data applications. Kowalski (2009) studied the role of mixed norms in enforcing some specific types of joint sparsity and diversity when used as regularization terms in regression problems. Yildirim and Özkale (2022) proposed an algorithm based on the combination of ridge and Liu regressions to deal with multicollinearity in extreme learning machines.

The lambda calculus is a theory of functions as formulas. In this system, functions are written as expressions. Lambda calculus was introduced by Church (1936) in 1936 to formalize the concept of effective computability. Also in 1936, before learning of Church’s work, Turing (1936) created a theoretical model for machines, now called Turing machines. A Turing machine computes with inputs by manipulating symbols on a tape. A function on the natural numbers is called \( \lambda \)-computable if the corresponding function on the Church numerals can be represented by a \( \lambda \)-term. A function is Turing computable if a Turing machine can compute the function’s value. For each \( \lambda \)-term M, a tree BT(M), called its Böhm tree, is defined (see Barendregt (1984)). Böhm trees play an important role in the analysis of the \( \lambda \)-models \( P_\omega \) and \( D_\infty \). Terms relate to their Böhm trees analogously to how a real number relates to its continued fraction expansion. If M has a normal form, the tree BT(M) is finite. In this respect, normal forms relate to rational numbers. Nayak et al. (2018) defined \( \varTheta \varGamma \) N-Groups and showed that Church numerals and Church pairs form a \( \varTheta \varGamma \) N-Group. Koppula et al. (2021) showed that Church numerals naturally form a seminearring and give rise to perfect ideals.

A magic square M of order n is an \( n \times n \) array of integers from the set \( \{1, 2,\dots , n^2\} \), wherein the integers in each row, in each column, in the main diagonal, and in the main anti-diagonal all add up to the same number, called the magic constant. The magic constant of a magic square of order n is \( n (n^2 + 1)/2\). Furthermore, magic squares exist for all positive orders, except in the case of \( n = 2 \). A Latin square of order n is an \( n \times n \) array over a set of n symbols such that every symbol appears exactly once in each row and exactly once in each column. These symbols can be letters, numbers, colors, etc. McCranie (1988) gave methods for constructing magic squares of all orders that can be adapted for use on a general-purpose electronic digital computer. Benjamin and Brown (2014) presented several effective ways of constructing magic squares given the magic constant. Keedwell and Dénes (2015) gave applications of Latin squares and their connections with magic squares. Fisher (1926) recommended Latin squares for crop experiments, and Fisher (1971) discussed Graeco-Latin and higher squares. Shao and Wei (1992) gave a formula for the number of Latin squares of a given order. Latin squares encode the features of algebraic structures. Wanless (2007) gave a brief survey of generalizations of transversals, the case when the Latin square is a group table and its connection with covering radii of sets of permutations. Potapov (2016) gave bounds for the number of transversals over all Latin squares of order n and showed that the logarithm of the maximum number of transversals over all Latin squares of order n is greater than \( \dfrac{n}{6}(\ln n + O(1)) \). Vadiraja Bhatta and Shankar (2017) considered Latin squares formed by some permutation polynomials over finite rings and studied them with regard to a few Latin square properties.
Latin squares have applications in coding theory, experimental designs and the design of game tournaments. Famous puzzles such as Sudoku and KenKen are special Latin squares. Johanna et al. (2012) presented a hybrid genetic algorithm that can solve the KenKen puzzle. In this paper, we consider Latin squares and magic squares and compute their weighted path norms, which distinguish and order all possible solutions.
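The defining conditions of a magic square are mechanical to verify. The following Python sketch (the function name is our own, not from the paper) checks that the entries are \( 1,\dots ,n^2 \) and that every row, column, the main diagonal and the main anti-diagonal sum to the magic constant \( n(n^2+1)/2 \); the order-3 square used below is the classical Lo Shu square.

```python
def is_magic(M):
    """Check the magic-square conditions for an n x n array M of integers."""
    n = len(M)
    c = n * (n * n + 1) // 2  # magic constant
    entries = sorted(x for row in M for x in row)
    return (entries == list(range(1, n * n + 1))          # entries are 1..n^2
            and all(sum(row) == c for row in M)           # row sums
            and all(sum(M[i][j] for i in range(n)) == c   # column sums
                    for j in range(n))
            and sum(M[i][i] for i in range(n)) == c       # main diagonal
            and sum(M[i][n - 1 - i] for i in range(n)) == c)  # anti-diagonal
```

For instance, the Lo Shu square \( [[2,7,6],[9,5,1],[4,3,8]] \) passes the check with magic constant 15.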

The condition number \( \kappa \) of a matrix measures the sensitivity of the solution of a linear system to small changes in the coefficients. Matrix norms are used to compute condition numbers. The matrix condition number was introduced by Turing (1948). Marshall and Olkin (1965) proposed that, under certain conditions, the matrix \( AA^*\) is more ill-conditioned than A. Narcowich and Ward (1991) gave a general method for obtaining bounds both on the norm of the inverse of the interpolation matrix and on the condition number of that matrix. Zielke (1988) established inequalities between norms of rectangular matrices and the corresponding relationships between condition numbers. We refer to Higham (2002) for a detailed study of condition numbers. We use path norms to compute condition numbers and compare the results with other methods.

We have organized the rest of the paper as follows. In Sect. 2, we provide the basic definitions and related primary results. In Sect. 3, we define the binary path seminorm. In Sect. 4, we define the binary path norm, present a dynamic programming algorithm with quadratic running time and discuss different path norms. In Sect. 5, we present the binary wrapping path seminorm and norm. Different versions of ternary path norms analogous to binary path norms are discussed in Sects. 6, 7 and 8. In Sect. 9, we define the path norm on Church numerals and Church pairs. In Sect. 10, we define the weighted path norm and the strictly weighted path norm and compute them for all Latin squares and magic squares of order 3. In Sect. 11, the condition numbers of different matrices are computed using the path norm and compared with the condition numbers computed from the standard matrix norms \( \Vert A\Vert _p\), \( p = 1,2,\infty \).

2 Basic definitions and preliminaries

In this section, we provide basic definitions and results. For the fundamentals of the analysis of algorithms, we refer to Cormen et al. (2001). For further understanding of vector norms and matrix norms, we refer to Meyer (2000) and Strang (2006).

Definition 2.1

Naimark (1964) Let R be an associative ring. A norm on R is a function \( \mid \; \mid : R \rightarrow \mathbb {R} \) that satisfies the following conditions for all \( a, b \in R \):

  1. (i)

    \( \mid a\mid \; \ge 0 \) and \( \mid a\mid \; =0 \) if and only if \( a=0 \);

  2. (ii)

    \( \mid a+b\mid \; \le \;\mid a\mid +\mid b\mid \);

  3. (iii)

    \( \mid ab\mid \;\le \;\mid a\mid \mid b\mid \).

A ring R on which a norm is defined is called a normed ring.

In what follows, R denotes a normed ring. We denote the set of all matrices of order \( m \times n \) with entries from R by \( M_{m\times n}(R) \).

Definition 2.2

Horn and Johnson (2013) The function \( \Vert \;\Vert : M_{m\times n}(R) \rightarrow \mathbb {R} \) is said to be a seminorm if it satisfies the following properties:

  1. (i)

    \(\Vert A\Vert \ge 0\) (non-negativity);

  2. (ii)

    \( \Vert cA\Vert \le \;\mid c\mid \Vert A \Vert \) (weak absolute homogeneity)

  3. (iii)

    \(\Vert A+B\Vert \le \Vert A \Vert +\Vert B \Vert \) (triangular inequality);

The seminorm of a nonzero matrix can be zero.

Definition 2.3

Horn and Johnson (2013) The function \( \Vert \;\Vert : M_{m\times n}(R) \rightarrow \mathbb {R} \) is said to be a norm if it satisfies the following properties:

  1. (i)

    \(\Vert A\Vert \ge 0\) (non-negativity);

  2. (ii)

\(\Vert A\Vert = 0\) if and only if \( A=0 \) (definiteness);

  3. (iii)

    \( \Vert cA\Vert \le \;\mid c\mid \Vert A \Vert \) (weak absolute homogeneity)

  4. (iv)

    \(\Vert A+B\Vert \le \Vert A \Vert +\Vert B \Vert \) (triangular inequality);

  5. (v)

    \(\Vert Ax\Vert \le \Vert A \Vert \Vert x \Vert \) (compatibility with a vector);

Definition 2.4

Horn and Johnson (2013) A square matrix A of the form

$$\begin{aligned} A = \begin{pmatrix} a_1 &{} a_2 &{} \cdots &{} a_n \\ a_n &{} a_1 &{} \cdots &{} a_{n-1} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ a_2 &{} a_3 &{} \cdots &{} a_1 \end{pmatrix} \end{aligned}$$

is called a circulant matrix, denoted by \( cir(a_1,a_2,\dots ,a_n) \). Each row is obtained from the previous row by cyclically shifting its entries one position to the right; the entries in each row are a cyclic permutation of those in the first.

Definition 2.5

Golub and Van Loan (2013)

Let \( A \in M_{m\times n}(R)\). The column norm of a matrix is given by

$$\begin{aligned} \Vert A\Vert _1 = \max _{1\le j \le n} \sum _{i=1}^{m}\mid a_{ij} \mid . \end{aligned}$$
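The column norm of Definition 2.5 can be computed directly from the formula. This minimal Python sketch (the function name is our own) assumes real or complex entries with the built-in absolute value as the norm on R:

```python
def column_norm(A):
    """||A||_1 of Definition 2.5: the maximum over columns j of the
    sum of |a_ij| over rows i."""
    m, n = len(A), len(A[0])
    return max(sum(abs(A[i][j]) for i in range(m)) for j in range(n))
```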

Definition 2.6

Turing (1948) For a non-singular square matrix A, the condition number \( \kappa (A) \) is defined as follows:

$$\begin{aligned} \kappa (A)=\Vert A \Vert \Vert A^{-1} \Vert . \end{aligned}$$
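Definition 2.6 can be illustrated numerically. The sketch below uses NumPy with the 1-norm of Definition 2.5; the \( 2 \times 2 \) matrix is an arbitrary illustrative choice, not one of the paper's examples.

```python
import numpy as np

# kappa(A) = ||A|| * ||A^{-1}||, here with the matrix 1-norm
# (maximum absolute column sum).
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
kappa = np.linalg.norm(A, 1) * np.linalg.norm(np.linalg.inv(A), 1)
# np.linalg.cond(A, 1) computes the same quantity in one call.
```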

Definition 2.7

Hindley and Seldin (2008)

Let \( V = \{v,v',v'', \dots \} \) be an infinite set of variables. The set of \( \lambda \)-terms, denoted by \( \varLambda \), is defined inductively as follows:

$$\begin{aligned} x \in V&\implies x \in \varLambda ;\\ M,N \in \varLambda&\implies (MN)\in \varLambda \; (\text {application});\\ M \in \varLambda , x \in V&\implies (\lambda x.M) \in \varLambda \; (\text {abstraction}) \end{aligned}$$

where M and N are \( \lambda - \)terms.

Any natural number n can be represented by a \( \lambda \)-term.

Definition 2.8

Hindley and Seldin (2008) For every \( n \in \mathbb {N} \), the Church numeral \( \overline{n} \) is the term defined by

$$\begin{aligned} \overline{n} = \lambda fx.f^nx. \end{aligned}$$

Here are the first few Church numerals:

$$\begin{aligned} \overline{0}&= \lambda fx.x\\ \overline{1}&= \lambda fx.fx\\ \overline{2}&= \lambda fx.f(fx)\\ \overline{3}&= \lambda fx.f(f(fx)) \end{aligned}$$

If f and x are lambda terms, and \( n \ge 0 \) a natural number, we write \( f^nx \) for the term \( f(f(\dots (fx)\dots )) \), where f occurs n times. Let C denote the set of all Church numerals. The addition and multiplication on C are defined as follows:

$$\begin{aligned} {\begin{matrix} + &{}\equiv \lambda mnfx.mf(nfx)\\ *&{}\equiv \lambda mnfx.m(nf)x. \end{matrix}} \end{aligned}$$
(1)

Let \( \overline{m}, \overline{n} \in C \).

$$\begin{aligned} \overline{m}+ \overline{n}&= (\lambda mnfx.mf(nfx))(\lambda fx. f^m x)(\lambda gy.g^n y)\nonumber \\&\triangleright _\beta \lambda fx.f^{m+n} x\nonumber \\&= \overline{m+n} \end{aligned}$$
(2)

and

$$\begin{aligned} \overline{m} *\overline{n}&= (\lambda mnfx.m(nf)x)(\lambda fx. f^m x)(\lambda gy.g^n y)\nonumber \\&\triangleright _\beta \lambda fx.f^{m\times n} x\nonumber \\&= \overline{m\times n}. \end{aligned}$$
(3)

A Church pair is a pair of Church numerals that represents an integer as a \( \lambda \)-term: one numeral encodes the positive part and the other the negative part, and the integer value is the difference between the two.

Let x be an integer. It is represented as a pair of natural numbers \( (x_p,x_n) \) such that \( x = x_p - x_n \). In \( \lambda \)-terms, the pair function is given by \( \lambda x y f. f x y \), and x is represented as \( \lambda f. f (\lambda gy.g^{x_p} y) (\lambda hz.h^{x_n} z) \).

The addition and multiplication on Church pairs are naturally defined as follows:

$$\begin{aligned} x \oplus y&= [x_p,x_n] \oplus [y_p,y_n] = [x_p+y_p , x_n+y_n] \end{aligned}$$
(4)
$$\begin{aligned} {\begin{matrix} x \circledast y &{}= [x_p,x_n] \circledast [y_p,y_n] = [x_p *y_p +x_n *y_n,\\ &{}\qquad \qquad \qquad x_p*y_n + x_n *y_p] \end{matrix}} \end{aligned}$$
(5)

where \( + \) and \( *\) are as defined in (1).
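The arithmetic of (1) to (5) can be mimicked with Python closures. In the following sketch, the names church, unchurch, pair_add, pair_mul and pair_int are our own, and Python tuples stand in for the \( \lambda \)-encoded pairs:

```python
def church(n):
    """The Church numeral  n-bar = λf.λx.f^n x  as a Python closure."""
    def numeral(f):
        def apply_n(x):
            for _ in range(n):
                x = f(x)
            return x
        return apply_n
    return numeral

# + = λmnfx.mf(nfx)  and  * = λmnfx.m(nf)x, as in (1)
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: lambda x: m(n(f))(x)

def unchurch(c):
    """Recover the natural number by applying the numeral to the successor."""
    return c(lambda k: k + 1)(0)

# A Church pair (x_p, x_n) encodes the integer x_p - x_n; the operations
# below are the componentwise definitions (4) and (5).
def pair_add(x, y):
    return (add(x[0])(y[0]), add(x[1])(y[1]))

def pair_mul(x, y):
    return (add(mul(x[0])(y[0]))(mul(x[1])(y[1])),
            add(mul(x[0])(y[1]))(mul(x[1])(y[0])))

def pair_int(x):
    return unchurch(x[0]) - unchurch(x[1])
```

For instance, pair_mul applied to encodings of 2 and -3 yields an encoding of -6, in agreement with (5).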

The Böhm trees are defined as follows:

Definition 2.9

Barendregt (1984) Let \( \varSigma \) be a set of symbols. A \( \varSigma \)-labeled tree is a tree in which an element of \( \varSigma \) is written at each node.

Let

$$\begin{aligned} {\begin{matrix} \varSigma \ = \ &{}\{\perp \} \;\\ &{}\cup \; \{\lambda x_1 \dots x_n . y \mid \; n\in \mathbb {N}, x_1,\dots ,x_n,y \text { variables} \}. \end{matrix}} \end{aligned}$$

Then, BT(M) is a \( \varSigma - \)labeled tree defined as follows: if M is unsolvable, then BT(M) is a single node with label \( \perp \); if M has a head normal form \( \lambda x_1 \dots x_n.yM_1 \dots M_m \), then the root of BT(M) is labeled \( \lambda x_1 \dots x_n . y \) and its immediate subtrees are \( BT(M_1),\dots ,BT(M_m) \).

3 Path seminorm on a matrix

Definition 3.1

We define \( n_b: M_{m\times n} (R) \rightarrow \mathbb {R} \) as follows:

$$\begin{aligned} n_b(A)= \max \bigg \{\mid \sum _{i=1}^{m} a_{ij_i} \mid \; : \;&1 \le j_1,j_2,\dots ,j_m \le n , \\ {}&j_s = j_{s-1} \text { or } j_{s-1}+1, \\&\text { for } 2 \le s \le m\bigg \}. \end{aligned}$$

We denote the conditions on the \( j_i \)’s by

$$\begin{aligned} \left. \begin{array}{rl} \quad &{}1 \le j_1,j_2,\dots ,j_m \le n\\ \quad &{}j_s = j_{s-1} \text { or } j_{s-1}+1\\ \quad &{}\text {for } 2 \le s \le m \end{array} \right\} \end{aligned}$$
(6)

Proposition 3.2

For \( A,B \in M_{m\times n} (R) \), \(n_b\) satisfies the following properties:

  1. (i)

    \(n_b(A)\ge 0\) (non-negativity);

  2. (ii)

    \( n_b(cA) \le \; \mid c\mid n_b(A) \), for any c in R (weak absolute homogeneity);

  3. (iii)

    \(n_b(A+B)\le n_b(A)+n_b(B)\) (triangular inequality);

  4. (iv)

    \(n_b\) is a seminorm on \( M_{m\times n}(R) \).

Proof

Let A and B be \( m \times n \) matrices with entries from the normed ring R, and let c be an element of R.

  1. (i)

    By Definition 3.1, it is clear that \(n_b(A)\ge 0\).

  2. (ii)

    \( \begin{aligned} n_b(cA)&= \max \bigg \{\mid \sum _{i=1}^{m} ca_{ij_i} \mid \;:\; \text {satisfying } (6) \bigg \}\\&\le \max \bigg \{\mid c \mid \, \mid \sum _{i=1}^{m} a_{ij_i} \mid \;:\; \text {satisfying } (6) \bigg \}\\&= \mid c \mid \max \bigg \{\mid \sum _{i=1}^{m} a_{ij_i} \mid \;:\; \text {satisfying } (6) \bigg \}\\&= \mid c \mid n_b(A). \end{aligned} \)

  3. (iii)

    Let \( n_b(A+B) =\mid \sum \limits _{i=1}^{m} r_{ij_i} \mid .\) Then,

    $$\begin{aligned} n_b(A+B)&= \mid \sum _{i=1}^{m} r_{ij_i} \mid \\&= \mid \sum _{i=1}^{m} (a_{ij_i}+b_{ij_i}) \mid \\&= \mid \left( \sum _{i=1}^{m} a_{ij_i}\right) + \left( \sum _{i=1}^{m} b_{ij_i}\right) \mid \\&\le \mid \sum _{i=1}^{m} a_{ij_i} \mid + \mid \sum _{i=1}^{m} b_{ij_i} \mid \\&\le \max \bigg \{\mid \sum _{i=1}^{m} a_{ij_i} \mid \; : \text {satisfying } (6) \bigg \} \\&\quad + \max \bigg \{\mid \sum _{i=1}^{m} b_{ij_i} \mid \; : \text {satisfying } (6) \bigg \}\\&= n_b(A) + n_b(B). \end{aligned}$$
  4. (iv)

    Follows from (i), (ii), (iii).

\(\square \)

We call \( n_b \) the binary row path seminorm.
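The seminorm \( n_b \) also admits a bottom-up computation. Since the absolute value in Definition 3.1 is applied to the signed path sum, the recursion must carry both the maximum and the minimum signed sums and return the larger magnitude. The following Python sketch (the function name is our own; real entries are assumed) illustrates the idea:

```python
def binary_row_path_seminorm(A):
    """n_b(A): the maximum of |sum of entries along a path| over all
    paths satisfying (6), computed bottom-up while tracking both the
    largest and the smallest signed path sum starting at each entry."""
    m, n = len(A), len(A[0])
    hi = [float(x) for x in A[-1]]  # max signed sum of a path starting at (i, j)
    lo = hi[:]                      # min signed sum of a path starting at (i, j)
    for i in range(m - 2, -1, -1):
        new_hi, new_lo = [0.0] * n, [0.0] * n
        for j in range(n):
            nxt = [j] if j == n - 1 else [j, j + 1]  # column stays or moves right
            new_hi[j] = A[i][j] + max(hi[k] for k in nxt)
            new_lo[j] = A[i][j] + min(lo[k] for k in nxt)
        hi, lo = new_hi, new_lo
    return max(max(abs(v) for v in hi), max(abs(v) for v in lo))
```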

4 Path norm on a matrix

Definition 4.1

We define \( N_b: M_{m\times n} (R) \rightarrow \mathbb {R} \) as follows:

$$\begin{aligned} N_b(A)= \max \bigg \{\sum _{i=1}^{m}\mid a_{ij_i} \mid \; : \text {satisfying } (6) \bigg \}. \end{aligned}$$

We denote this way of computing \( N_b \) by \( R \downarrow R \).

Proposition 4.2

For \( A,B \in M_{m\times n} (R) \), \( N_b \) satisfies the following properties:

  1. (i)

    \(N_b(A)\ge 0\) (non-negativity);

  2. (ii)

    \(N_b(A)=0\) if and only if \(A=0\) (definiteness);

  3. (iii)

    \( N_b(cA) \le \; \mid c\mid N_b(A) \), for any c in R (weak absolute homogeneity);

  4. (iv)

    \(N_b(A+B)\le N_b(A)+N_b(B)\) (triangular inequality);

  5. (v)

    \(N_b(Ax)\le N_b(A)N_b(x)\) for any column vector x (compatibility with a vector norm);

  6. (vi)

    \(N_b\) is a norm on \( M_{m\times n}(R) \).

Proof

Let A and B be \( m \times n \) matrices with entries from the normed ring R, and let c be an element of R.

  1. (i)

    By Definition 4.1, it is clear that \(N_b(A)\ge 0\).

  2. (ii)

    Suppose that \( A=0 \). Clearly, \( N_b(A) = 0 \).

    Conversely, suppose that \( N_b(A) = 0 \), that is,

\( \max \bigg \{\sum \limits _{i=1}^{m}\mid a_{ij_i} \mid \;: \text {satisfying } (6) \bigg \}= 0 \). This implies \( \sum \limits _{i=1}^{m}\mid a_{ij_i} \mid = 0 \) for all \( j_1,\dots ,j_m \) satisfying (6).

This implies \( a_{ij_i} = 0 \) for all i and all \( j_i \) satisfying (6). That is, \( a_{ij} = 0 \) for all \( 1\le i \le m \) and \( 1\le j \le n \).

    Thus, \( A = 0 \).

  3. (iii)

    \( \begin{aligned} N_b(cA)&= \max \bigg \{\sum _{i=1}^{m}\mid ca_{ij_i} \mid \;: \text {satisfying } (6) \bigg \}\\&\le \max \bigg \{\mid c \mid \sum _{i=1}^{m}\mid a_{ij_i} \mid \;: \text {satisfying } (6) \bigg \}\\&= \mid c \mid \max \bigg \{ \sum _{i=1}^{m}\mid a_{ij_i} \mid \;: \text {satisfying } (6) \bigg \}\\&= \mid c \mid N_b(A). \end{aligned} \)

  4. (iv)

    Let \( N_b(A+B) = \sum \limits _{i=1}^{m}\mid r_{ij_i} \mid \).

    $$\begin{aligned} N_b(A+B)&= \sum \limits _{i=1}^{m}\mid r_{ij_i} \mid \\&= \sum \limits _{i=1}^{m} \mid a_{ij_i}+b_{ij_i} \mid \\&\le \sum \limits _{i=1}^{m} \mid a_{ij_i} \mid + \sum \limits _{i=1}^{m} \mid b_{ij_i} \mid \\&\le \max \bigg \{\sum _{i=1}^{m}\mid a_{ij_i} \mid \; : \text {satisfying } (6) \big \} \\&\quad + \max \bigg \{\sum _{i=1}^{m}\mid b_{ij_i} \mid \; : \text {satisfying } (6) \big \}\\&= N_b(A) + N_b(B). \end{aligned}$$
  5. (v)

    \( A = \begin{pmatrix} a_{11} &{} a_{12} &{} a_{13} &{} \cdot &{} \cdot &{} \cdot &{} a_{1n} \\ a_{21} &{} a_{22} &{} a_{23} &{} \cdot &{} \cdot &{} \cdot &{} a_{2n} \\ a_{31} &{} a_{32} &{} a_{33} &{} \cdot &{} \cdot &{} \cdot &{} a_{3n} \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots \\ a_{m1} &{} a_{m2} &{} a_{m3} &{} \cdot &{} \cdot &{} \cdot &{} a_{mn} \end{pmatrix} \), \( x = \begin{pmatrix} x_1\\ x_2\\ x_3\\ \vdots \\ x_n \end{pmatrix}. \)

$$\begin{aligned} Ax&= \begin{pmatrix} a_{11} x_1+ a_{12}x_2+ a_{13} x_3+ \cdots + a_{1n}x_n \\ a_{21} x_1+ a_{22} x_2+ a_{23}x_3+ \cdots + a_{2n}x_n \\ \vdots \\ a_{m1} x_1+ a_{m2} x_2+ a_{m3} x_3+ \cdots + a_{mn}x_n \end{pmatrix}\\&= \alpha _1 E_1 + \alpha _2 E_2 + \cdots +\alpha _m E_m \end{aligned}$$

    where \( \alpha _i = \displaystyle {\sum _{j=1}^{n}a_{ij}}x_j \) and \( E_i = \begin{pmatrix} 0\\ 0\\ 1 \\ \vdots \\ 0 \end{pmatrix}\) is the column matrix with one in ith row and zero elsewhere. \( \begin{aligned} N_b&(Ax) = N_b(\alpha _1 E_1 + \alpha _2 E_2 + \cdots +\alpha _m E_m)\\&\le N_b(\alpha _1 E_1 )+ N_b(\alpha _2 E_2 )+ \cdots + N_b(\alpha _m E_m )\\&= \mid \alpha _1 \mid N_b(E_1) + \mid \alpha _2 \mid N_b(E_2) + \cdots \\ {}&\qquad + \mid \alpha _m \mid N_b(E_m) \\&= \mid \alpha _1 \mid + \mid \alpha _2 \mid + \cdots + \mid \alpha _m \mid \\&\qquad \text {because } N_b(E_i) = 1 \text { for each } i,\; 1\le i \le m.\\&= \mid a_{11} x_1+ a_{12}x_2+ a_{13} x_3+ \cdots + a_{1n}x_n \mid \nonumber \\&\quad + \mid a_{21} x_1+ a_{22} x_2+ a_{23}x_3+ \cdots + a_{2n}x_n \mid + \cdots \nonumber \\&\quad + \mid a_{m1} x_1+ a_{m2} x_2+ a_{m3} x_3+ \cdots + a_{mn}x_n \mid \\&\le \mid a_{11} \mid \mid x_1 \mid + \mid a_{12}\mid \mid x_2\mid + \mid a_{13}\mid \mid x_3\mid + \cdots \nonumber \\&\quad +\mid a_{1n}\mid \mid x_n \mid + \mid a_{21}\mid \mid x_1\mid +\mid a_{22}\mid \mid x_2\mid \nonumber \\&\quad + \mid a_{23}\mid \mid x_3\mid + \cdots + \mid a_{2n}\mid \mid x_n \mid + \cdots \nonumber \\&\quad + \mid a_{m1} \mid \mid x_1\mid + \mid a_{m2}\mid \mid x_2\mid + \mid a_{m3} \mid \mid x_3\mid \nonumber \\&\quad + \cdots +\mid a_{mn}\mid \mid x_n \mid \\&= (\mid a_{11} \mid +\mid a_{21} \mid +\cdots +\mid a_{m1} \mid )\mid x_1 \mid \nonumber \\&\quad + (\mid a_{12} \mid +\mid a_{22} \mid +\cdots +\mid a_{m2} \mid )\mid x_2 \mid + \cdots \nonumber \\&\quad + (\mid a_{1n} \mid +\mid a_{2n} \mid +\cdots +\mid a_{mn} \mid )\mid x_n \mid \\&\le N_b(A) \mid x_1 \mid + N_b(A) \mid x_2 \mid + \cdots + N_b(A) \mid x_n \mid \\&= N_b(A) (\mid x_1 \mid + \mid x_2 \mid + \cdots + \mid x_n \mid )\\&= N_b(A) N_b(x). \end{aligned} \)

  6. (vi)

    Follows from (i), (ii), (iii), (iv), (v).

\(\square \)

We call \( N_b \) the binary row path norm.

Proposition 4.3

The brute-force method of computing \( N_b \) has exponential running time.

Proof

Let \( A = (a_{ij}) \) be an \( m\times n \) matrix.


The brute-force method of computing the binary row path norm is to find all possible paths \( a_{1j_1},a_{2j_2}, a_{3j_3},\) \(\dots , a_{mj_m} \) satisfying (6), compute the sum of the absolute values of the entries along each path, and take the maximum of these sums. We count the possible paths, that is, the paths that satisfy (6), as follows:

Let \( P(i,j) \) denote the number of possible paths starting from the \( (i,j)^{th} \) entry of the matrix A. Then

$$\begin{aligned} P(i,j) = {\left\{ \begin{array}{ll} 1, &{} \begin{aligned} \text {if } 1\le i \le m \\ \text { and }j = n \end{aligned} \\ 1, &{} \begin{aligned} \text {if } 1\le j \le n \\ \text { and }i = m \end{aligned}\\ \begin{aligned} P(i+1,j)\qquad \qquad \\ + P(i+1,j+1), \end{aligned} &{} \begin{aligned} \text {if } 1\le i \le m-1 \\ \text { and } 1\le j \le n-1 \end{aligned} \end{array}\right. } \end{aligned}$$
(7)

Let \( T(m,n) \) denote the total number of possible paths in the matrix A. Then,

$$\begin{aligned} \begin{aligned} T(m,n)&= P(1,1) + P(1,2)+ \dots + P(1,n)\\&= \sum _{t=1}^{n} P(1,t). \end{aligned} \end{aligned}$$
(8)

The relation (8) can be defined recursively as follows:

Denote by \( A(m,n) \) the submatrix of order \((m-1) \times (n-1)\) obtained by removing the \( m^{th} \) row and the \( n^{th} \) column of the matrix A, and denote by \( A(m,\cdot ) \) the submatrix of order \((m-1) \times n\) obtained by removing the \( m^{th} \) row alone.

Every path in \( A(m,\cdot ) \) that ends in a column \( j \le n-1 \) is a path in \( A(m,n) \), and it extends in two ways (to column j or \( j+1 \)) into the \( m^{th} \) row of A; this gives \( 2 \times T(m-1,n-1) \) paths, where \( T(m-1,n-1) \) is the total number of possible paths in \( A(m,n) \).

The number of paths in \( A(m,\cdot ) \) that end in the last column is the difference between the number of all paths in \( A(m,\cdot ) \) and the number of all paths in \( A(m,n) \), that is, \( T(m-1,n) - T(m-1,n-1) \); each of these paths extends in exactly one way (staying in column n), contributing \( 1 \times (T(m-1,n) - T(m-1,n-1)) \) paths.

Thus, we have

$$\begin{aligned} \begin{aligned} T(m,n)&= 2 \times T(m-1,n-1) \\&\quad + 1 \times (T(m-1,n) - T(m-1,n-1))\\&= T(m-1,n-1) + T(m-1,n). \end{aligned} \end{aligned}$$

The recursive relation is given by

$$\begin{aligned} T(m,n) = {\left\{ \begin{array}{ll} 1, &{} \text {if } n=1\\ n, &{} \text {if } m=1\\ T(m-1,n-1) + T(m-1,n), &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
(9)

We show that the brute-force method takes exponential time. We use induction on m to prove that \( T(m,n) = 2^{m-2}(2n - m + 1) \) for \( 2 \le m \le n+1 \); for larger m, the boundary at the last column reduces the count.

Let n be fixed.

Base case:

$$\begin{aligned} T(2,n)&= T(1,n-1) + T(1,n)\\&= (n-1) + n\\&= 2n -1. \end{aligned}$$

For \( m=2 \), we have

$$\begin{aligned} T(2,n) = 2^{2-2}(2n - 2 + 1) = 2^{0} (2n-1) = 2n-1. \end{aligned}$$

Assume that the condition is true for the case \( m-1 \), that is, \( T(m-1,n) = 2^{m-3}(2n - m + 2) \).

Now,

$$\begin{aligned} T(m,n)&= T(m-1,n-1) + T(m-1,n)\\&= 2^{(m-1)-2}(2(n-1) - (m-1) + 1)\\&\qquad + 2^{m-3}(2n - m + 2)\\&= 2^{m-3}(2n-2 -m +1 +1 + 2n -m + 2)\\&= 2^{m-3}(4n -2m + 2)\\&= 2^{m-2}(2n - m +1). \end{aligned}$$

Therefore, it is clear that the brute-force way of finding all possible paths takes exponential time. Hence, computing the binary row path norm takes \( O(2^m n) \) running time. \(\square \)
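The recurrence (9) and the closed form can be checked numerically. The sketch below memoizes T and compares it with \( 2^{m-2}(2n-m+1) \) in the range \( 2 \le m \le n+1 \), where the induction above applies:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(m, n):
    """Number of possible paths in an m x n matrix, recurrence (9)."""
    if n == 1:
        return 1
    if m == 1:
        return n
    return T(m - 1, n - 1) + T(m - 1, n)
```

For example, \( T(3,4) = 12 \) agrees with \( 2^{1}(2\cdot 4 - 3 + 1) = 12 \).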

In the dynamic programming technique, we solve a problem by combining solutions to sub-problems. We propose dynamic programming algorithms for computing the path norms. The dynamic programming algorithms presented in this paper take quadratic running time in contrast with brute-force or recursive methods, which take exponential running time.

We begin with the algorithm to compute the binary row path norm of a matrix.

[Algorithm 4.1: dynamic programming computation of the binary row path norm]

Time and space complexity of Algorithm 4.1

The basic operation involved in the algorithm is the comparison of norms that occurs in Line 9. The for loop from Line 3 to Line 22 involves \((m-1)(n-1)\) comparisons in total. Therefore, the time complexity of Algorithm 4.1 is \((m-1)(n-1) = mn-m-n+1 = O(mn)\). Assuming \( m=O(n) \), the time complexity of the algorithm is \( O(n^2)\), that is, quadratic running time. The input matrix A and the matrix colIndicesMatrix take O(mn) space each, and the array dynamicRowValuesList takes O(n) space. Line 23 to Line 28 takes O(m) space. Therefore, the space complexity is O(mn). Assuming \( m=O(n) \), the space complexity of the algorithm is \( O(n^2)\).
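The recurrence behind the algorithm works bottom-up: the best path value starting at entry (i, j) is \( \mid a_{ij}\mid \) plus the better of the values in the row below at columns j and \( j+1 \). The following is a minimal self-contained Python sketch of this idea; the function name and the pluggable norm argument are our own choices, not the paper's listing.

```python
def binary_row_path_norm(A, norm=abs):
    """N_b(A): maximum over paths satisfying (6) of the sum of entry norms.
    dp[j] holds the best value of a path starting at (i, j); in the last
    column a path can only continue straight down."""
    m, n = len(A), len(A[0])
    dp = [norm(x) for x in A[-1]]          # last row: path value is the entry norm
    for i in range(m - 2, -1, -1):         # sweep upward row by row
        new = [0.0] * n
        for j in range(n):
            below = dp[j] if j == n - 1 else max(dp[j], dp[j + 1])
            new[j] = norm(A[i][j]) + below
        dp = new
    return max(dp)                         # best over all starting columns
```

On the matrix of Example 4.4 below this returns 222, and with the Manhattan norm \( \mid z \mid = \mid \textrm{Re}\,z \mid + \mid \textrm{Im}\,z \mid \) on the matrix of Example 4.6 it returns 925.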

We give examples of computing the binary row path norm using Algorithm 4.1 in Example 4.4, Example 4.5 and Example 4.6.

Example 4.4

Consider the matrix with real entries,

$$\begin{aligned} A = \begin{bmatrix} -49 &{} -13 &{} 23 &{} -133 \\ -71 &{} 42 &{} 68 &{} -41 \\ -34 &{} -102 &{} 18 &{} -6 \end{bmatrix}. \end{aligned}$$

The binary row path norm is \( N_b(A) = 222 \), attained along the path \( -49, -71, -102 \).

Example 4.5

Consider the matrix A with complex entries.

$$\begin{aligned} A = \begin{bmatrix} -82+84i &{} -110+128i \\ -15+104i &{} 63+35i \\ -101-129i &{} -44+78i\\ -52-72i &{} 16+93i\\ 32-113i &{} 150+136i \end{bmatrix} \end{aligned}$$

Using the Euclidean norm on complex numbers, the binary row path norm is \( N_b(A) = 683.1392 \), attained along the path \( -82+84i,\; -15+104i,\; -101-129i,\; 16+93i,\; 150+136i \).

Example 4.6

Consider the matrix A with complex entries.

$$\begin{aligned} A = \begin{bmatrix} -82+84i &{} -110+128i \\ -15+104i &{} 63+35i \\ -101-129i &{} -44+78i\\ -52-72i &{} 16+93i\\ 32-113i &{} 150+136i \end{bmatrix} \end{aligned}$$

Using the Manhattan norm on complex numbers, the binary row path norm is \( N_b(A) = 925 \), attained along the path \( -82+84i,\; -15+104i,\; -101-129i,\; -52-72i,\; 150+136i \).

Table 1 Different binary path norms on a matrix

In Example 4.5 and Example 4.6, we have used the Euclidean norm and the Manhattan norm, respectively, on the same matrix. However, one can use any norm on the ring to compute the binary row path norm.

4.1 Different binary path norms on a matrix

The binary row path norm can be redefined by changing the direction in which paths are traced. In that case, we get eight different types of norms for the same matrix: four row path norms and four column path norms, illustrated as items 1 to 8:

[Figure: the eight binary path norms \( \varvec{R\downarrow R} \), \( \varvec{R\uparrow L} \), \( \varvec{R\downarrow L} \), \( \varvec{R\uparrow R} \), \( \varvec{C\rightarrow D} \), \( \varvec{C\leftarrow U} \), \( \varvec{C\rightarrow U} \), \( \varvec{C\leftarrow D} \)]

We denote the dihedral group of symmetries of a square by \( D_8 \). The suffix \( \perp \) represents the rotation of the matrix by \( 90^{\circ } \) in the anticlockwise direction, the suffixes X and Y represent the reflections of the matrix about \( X- \)axis and \( Y- \)axis, respectively. An action of a group G on a set X is said to be faithful if there are no group elements g, except the identity element, such that \( gx=x \), for all x in X. Table 1 gives the faithful group action of \( D_8 \) on the set of different binary path norms. As the orbit of every binary path norm under this group action contains all the binary path norms, it is possible to obtain all binary path norms of a matrix by computing any one binary path norm on the matrices obtained by different symmetries of the matrix. The path norms from 1 to 8 computed in Sect. 4.1 can be obtained by computing \( R \downarrow R \) after applying rotations and reflections to the given matrix.

Proposition 4.7

The following binary path norms coincide:

  1. (i)

    \( \varvec{R\downarrow R} \) and \( \varvec{R\uparrow L} \).

  2. (ii)

    \( \varvec{R\downarrow L} \) and \( \varvec{R\uparrow R} \).

  3. (iii)

    \( \varvec{C \rightarrow D} \) and \( \varvec{C \leftarrow U} \).

  4. (iv)

    \( \varvec{C \rightarrow U} \) and \( \varvec{C \leftarrow D} \).

Proof

Let A be an \( m \times n \) matrix with entries from the normed ring R.

  1. (i)

    Note that,

    $$\begin{aligned} R\downarrow R \equiv \max \bigg \{\sum _{i=1}^{m}\mid&a_{ij_i} \mid \; : \text { satisfying } (6) \bigg \}. \end{aligned}$$

    We have

    $$\begin{aligned} R\uparrow L \equiv \max \bigg \{\mid a_{mj_m}&\mid + \mid a_{(m-1)j_{m-1}} \mid + \cdots \\&+ \mid a_{2j_2}\mid +\mid a_{1j_1} \mid \; : \;\\ {}&1 \le j_1,j_2,\dots ,j_m \le n , \\ {}&j_s = j_{s+1} \text { or }j_s = j_{s+1}-1, \\&\text { for } 1 \le s \le m-1\bigg \}\\ = \max \bigg \{\sum _{i=1}^{m}\mid a_{ij_i}&\mid \; : \;1 \le j_1,j_2,\dots ,j_m \le n , \\ {}&j_{s+1} = j_{s} \text { or }j_{s+1} = j_{s}+1, \\&\text { for } 2 \le s+1 \le m\bigg \}\\ = \max \bigg \{\sum _{i=1}^{m}\mid a_{ij_i}&\mid \; : \; 1 \le j_1,j_2,\dots ,j_m \le n , \\ {}&j_{k} = j_{k-1} \text { or }j_{k} = j_{k-1}+1, \\&\text { for } 2 \le k \le m\bigg \}\\ \equiv R \downarrow R. \end{aligned}$$

    The proofs of (ii), (iii) and (iv) are similar.

\(\square \)

From Proposition 4.7, we have Table 1 reduced to Table 2.

Proposition 4.8

For any matrix A with real non-negative entries,

$$\begin{aligned} N_b(A) = n_b(A). \end{aligned}$$

Proof

Let A be a non-negative matrix. Then, \( a_{ij}\ge 0 \), for all \( i, j \).

By Definition 3.1, we have

$$\begin{aligned} n_b(A) =&\max \left\{ \mid \sum _{i=1}^{m} a_{ij_i} \mid \; :\; \text {satisfying} (6) \right\} \\ =&\max \left\{ \sum _{i=1}^{m}\mid a_{ij_i} \mid \; :\; \text {satisfying} (6) \right\} \\ =&N_b(A). \end{aligned}$$

However, the converse of Proposition 4.8 is not true.

For instance, the \( 2\times 2 \) matrix with both rows equal to \( (2,\, -2) \) satisfies \( N_b(A) = n_b(A) = 4 \), although its entries are not all non-negative real numbers.

5 Binary wrapping path seminorm and norm

The concept of wrapping is useful in dealing with special matrices like circulant matrices. The path seminorm (norm) can be extended to a wrapping path seminorm (norm) by wrapping the last column of the matrix around to the first column, forming a cylinder-like structure.

Table 2 Four distinct binary path norms

Definition 5.1

We define \( n_{bw}: M_{m\times n} (R) \rightarrow \mathbb {R} \) as follows:

$$\begin{aligned} n_{bw}(A)= \max&\bigg \{\mid \sum _{i=1}^{m} a_{ij_i} \mid \; : \text { satisfying } (10)\bigg \}, \end{aligned}$$

where (10) is given by

$$\begin{aligned} \left. \begin{array}{rl} \quad &1 \le j_1,j_2,\dots ,j_m \le n\\ \quad &j_s = {\left\{ \begin{array}{ll} j_{s-1} \text { or } j_{s-1}+1, & \text {if } j_{s-1} \le n-1\\ 1 \text { or } n, & \text {if } j_{s-1} = n, \end{array}\right. }\\ \quad &\text {for } 2 \le s \le m \end{array} \quad \right\} \end{aligned}$$
(10)

By Proposition 3.2, \( n_{bw} \) is a seminorm on \( M_{m\times n} (R) \). We call \( n_{bw} \) the binary wrapping row path seminorm.

Definition 5.2

We define \( N_{bw}: M_{m\times n} (R) \rightarrow \mathbb {R} \) as follows:

$$\begin{aligned} N_{bw}(A)=&\max \bigg \{\sum _{i=1}^{m}\mid a_{ij_i} \mid \; : \text { satisfying } (10)\bigg \}. \end{aligned}$$

By Proposition 4.2, \( N_{bw} \) is a norm on \( M_{m\times n} (R) \). We call \( N_{bw} \) the binary wrapping row path norm.

From Proposition 4.3, it is clear that the time complexity of computing the binary wrapping row path norm by the brute-force method is \( O(2^m n) \).

The algorithm to compute binary wrapping row path norm of a matrix is obtained by replacing Line 17 to Line 19 of Algorithm 4.1 by Line 1 to Line 9 of Algorithm 5.1.

[Algorithm 5.1]

From the analysis of Algorithm 4.1, it is clear that Algorithm 5.1 has quadratic running time.
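The dynamic-programming idea behind Algorithm 5.1 can be sketched in a few lines of Python (the function name and code below are our own illustration, not the published listing): row by row, the best wrapping path ending in a column extends the best path ending at one of that column's predecessors under condition (10).

```python
def binary_wrap_path_norm(A):
    """Binary wrapping row path norm N_bw of Definition 5.2 (a sketch).

    Predecessors of column j under condition (10): stay in the same
    column, step one column right, or wrap from column n to column 1.
    """
    m, n = len(A), len(A[0])
    best = [abs(x) for x in A[0]]          # best path sums ending in row 1
    for i in range(1, m):
        new = []
        for j in range(n):
            # column j is reachable from itself and from j-1;
            # column 1 is additionally reachable from column n (wrap)
            preds = (0, n - 1) if j == 0 else (j - 1, j)
            new.append(abs(A[i][j]) + max(best[p] for p in preds))
        best = new
    return max(best)
```

For the matrix of Example 5.3 this returns 306, matching the value reported there.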

Example 5.3

Consider the matrix

$$\begin{aligned} A = \begin{bmatrix} -49 & -13 & 23 & -133 \\ -71 & 42 & 68 & -41 \\ -34 & -102 & 18 & -6 \end{bmatrix}. \end{aligned}$$

Using Algorithm 5.1, the binary wrapping row path norm of the matrix A is computed as

[Computation trace of Algorithm 5.1 for the matrix A]

The binary wrapping row path norm \( N_{bw}(A) = 306 \) along the path \( -133, -71, -102 \).

From Example 4.4 and Example 5.3, we see that the binary row path norm and binary wrapping row path norm need not be the same for a matrix.

Now, we use the concept of circulant matrices to show the necessity of binary wrapping row path norm.

Proposition 5.4

Circulant matrices \( cir(a_1,a_2,\dots , a_n),\)\( cir(a_2,a_3,\dots , a_n, a_1), \) \(\dots , cir(a_n, a_1,\dots , a_{n-1}) \) have the same binary wrapping row path norm.

Proof

Let \( cir(a_{p},a_{p+1},\dots ,a_n,a_1,\dots ,a_{p-1}) \) and

\( cir(a_{p+1},a_{p+2},\dots ,a_n,a_1,\dots ,a_{p}) \) be circulant matrices. We observe that \(cir(a_{p+1},a_{p+2},\dots ,a_n,a_1,\dots ,a_{p}) \) is obtained by shifting the first column of \( cir(a_{p},a_{p+1},\dots ,a_n,a_1,\dots ,a_{p-1}) \) to the last column.

Hence, the paths involved in computing the binary wrapping path norm of \( cir(a_{p},a_{p+1},\dots ,a_n,a_1,\dots ,a_{p-1}) \) are also shifted left by one column. Since the paths involved in both matrices are exactly the same, the binary wrapping path norms are also the same. \(\square \)
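Proposition 5.4 can be checked computationally. The sketch below is our own illustration: `circulant(a)` builds \( cir(a_1,\dots ,a_n) \) with each row the cyclic right shift of the row above it, and the binary wrapping row path norm is evaluated on all cyclic rotations of the defining tuple.

```python
def binary_wrap_path_norm(A):
    # binary wrapping row path norm N_bw under condition (10) (our sketch)
    m, n = len(A), len(A[0])
    best = [abs(x) for x in A[0]]
    for i in range(1, m):
        best = [abs(A[i][j]) + max(best[p]
                                   for p in ((0, n - 1) if j == 0 else (j - 1, j)))
                for j in range(n)]
    return max(best)

def circulant(a):
    # cir(a_1,...,a_n): each row is the cyclic right shift of the previous row
    n = len(a)
    return [[a[(j - i) % n] for j in range(n)] for i in range(n)]

# all cyclic rotations of the tuple yield the same wrapping path norm
a = [3, -1, 4, 1]
norms = {binary_wrap_path_norm(circulant(a[k:] + a[:k])) for k in range(len(a))}
assert len(norms) == 1
```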

Corollary 5.5

The binary row path norm of circulant matrices \( cir(a_1,a_2,\dots , a_n),\) \( cir(a_2,a_3,\dots , a_n, a_1), \) \(\dots , cir(a_n, a_1,\dots , a_{n-1}) \) need not be the same.

6 Ternary path seminorm on a matrix

Definition 6.1

We define \( n_t: M_{m\times n} (R) \rightarrow \mathbb {R} \) as follows:

$$\begin{aligned} n_t(A)= \max \bigg \{&\mid \sum _{i=1}^{m} a_{ij_i} \mid \; : \;1 \le j_1,j_2,\dots ,j_m \le n , \\ {}&j_s =j_{s-1}-1 \text { or } j_{s-1} \text { or } j_{s-1}+1,\\ {}&\text { for } 2 \le s \le m\bigg \}. \end{aligned}$$

We call the conditions on \( j_i \)’s as

$$\begin{aligned} \left. \begin{array}{rl} \quad &1 \le j_1,j_2,\dots ,j_m \le n\\ \quad &j_s =j_{s-1}-1 \text { or } j_{s-1} \text { or } j_{s-1}+1\\ \quad &\text {for } 2 \le s \le m \end{array} \right\} \end{aligned}$$
(11)

Proposition 6.2

For \( A,B \in M_{m\times n}(R) \), \( n_t \) satisfies the following properties.

  1. (i)

    \(n_t(A)\ge 0\) (non-negativity);

  2. (ii)

    \( n_t(cA) \le \; \mid c\mid n_t(A) \), for any c in R (weak absolute homogeneity);

  3. (iii)

    \(n_t(A+B)\le n_t(A)+n_t(B)\) (triangular inequality);

  4. (iv)

    \(n_t\) is a seminorm on \( M_{m\times n}(R) \).

Proof

The proof is similar to that of Proposition 3.2.\(\square \)

We call \( n_t \) the ternary row path seminorm.

Proposition 6.3

Let \( A \in M_{m\times n}(R) \). Then,

$$\begin{aligned} n_t(A) \ge n_b(A). \end{aligned}$$

Proof

By Definition 3.1 and Definition 6.1, every tuple \( (j_1,j_2,\dots ,j_m) \) satisfying condition (6) also satisfies condition (11), because \( j_s = j_{s-1} \) and \( j_s = j_{s-1}+1 \) are two of the three choices allowed in (11). Hence, the maximum defining \( n_t(A) \) is taken over a superset of the tuples considered for \( n_b(A) \). Therefore,

$$\begin{aligned} n_t(A) = \max \bigg \{ \mid \sum _{i=1}^{m} a_{ij_i} \mid \; : \; \text {satisfying } (11) \bigg \} \ge \max \bigg \{ \mid \sum _{i=1}^{m} a_{ij_i} \mid \; : \; \text {satisfying } (6) \bigg \} = n_b(A). \end{aligned}$$

\(\square \)

7 Ternary path norm on a matrix

Definition 7.1

We define \( N_t: M_{m\times n} (R) \rightarrow \mathbb {R} \) as follows:

$$\begin{aligned} N_t(A)= \max \bigg \{\sum _{i=1}^{m}\mid a_{ij_i} \mid \; : \; \text {satisfying } (11) \bigg \}. \end{aligned}$$

We denote this way of computing \( N_t \) as \( R\downarrow LR \).

Proposition 7.2

For \( A,B \in M_{m\times n}(R) \), \( N_t \) satisfies the following properties.

  1. (i)

    \(N_t(A)\ge 0\) (non-negativity);

  2. (ii)

    \(N_t(A)=0\) if and only if \(A=0\) (definiteness);

  3. (iii)

    \( N_t(cA) \le \; \mid c\mid N_t(A) \), for any c in R (weak absolute homogeneity);

  4. (iv)

    \(N_t(A+B)\le N_t(A)+N_t(B)\) (triangular inequality);

  5. (v)

    \(N_t(Ax)\le N_t(A)N_t(x)\) for any column vector x (compatibility with a vector norm);

  6. (vi)

    \(N_t\) is a norm on \( M_{m\times n}(R) \).

Proof

The proof is similar to that of Proposition 4.2.\(\square \)

We call \( N_t \) the ternary row path norm.

Proposition 7.3

For any matrix A with real non-negative entries,

$$\begin{aligned} N_t(A) = n_t(A). \end{aligned}$$

Proof

The proof is similar to that of Proposition 4.8. \(\square \)

The paths involved in computing the ternary row path norm have the following form:

[Diagram of the ternary row path pattern]

Proposition 7.4

The brute-force method of computing the ternary row path norm takes \( O(3^m n) \) running time.

Proof

The proof is similar to that of Proposition 4.3. \(\square \)

Since the brute-force method of computing the ternary row path norm takes exponential time, we propose a quadratic-time algorithm for computing the ternary row path norm of a matrix.

[Algorithm 7.1]

Time and space complexity of Algorithm 7.1

The basic operation involved in the algorithm is the comparison of norms that occurs in Line 19 and Line 22. The for loop from Line 3 to Line 40 involves \((m-1)(2(n-2)+2)\) comparisons in total. Therefore, the time complexity of the algorithm is \((m-1)(2(n-2)+2) = 2mn-2m-2n+2 = O(mn)\). Assuming \( m=O(n) \), the time complexity of the algorithm is \( O(n^2)\), that is, quadratic running time.

The input matrix A and the matrix colIndicesMatrix take O(mn) space each. The array dynamicRowValuesList takes O(n) space. From Line 41 to Line 46, the algorithm takes O(m) space. Therefore, the space complexity is O(mn). Assuming \( m=O(n) \), the space complexity of the algorithm is \( O(n^2)\).
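The recurrence implemented by Algorithm 7.1 can be sketched compactly in Python (our own illustration, not the published listing): under condition (11) the path may move one column left, stay, or move one column right between consecutive rows, so the predecessors of column j in the previous row are j-1, j and j+1.

```python
def ternary_path_norm(A):
    """Ternary row path norm N_t of Definition 7.1 (a sketch of the
    dynamic-programming recurrence behind Algorithm 7.1)."""
    m, n = len(A), len(A[0])
    best = [abs(x) for x in A[0]]          # best path sums ending in row 1
    for i in range(1, m):
        # predecessors {j-1, j, j+1}, clipped at the first and last columns
        best = [abs(A[i][j]) + max(best[max(j - 1, 0):min(j + 2, n)])
                for j in range(n)]
    return max(best)
```

For the matrix of Example 7.5 this returns 303, the value found by Algorithm 7.1.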

We now give an example of computing the ternary path norm of a matrix with real entries in Example 7.5.

Example 7.5

Let \( A = \begin{bmatrix} -49 & -13 & 23 & -133 \\ -71 & 42 & 68 & -41 \\ -34 & -102 & 18 & -6 \end{bmatrix}.\)

[Computation trace of Algorithm 7.1 for the matrix A]

The ternary row path norm \( N_t(A) = 303 \) along the path \( -133, 68, -102 \).

From Example 4.4 and Example 7.5, we observe that the ternary row path norm is different from the binary row path norm.

Example 7.6

Consider the matrix A with complex entries.

$$\begin{aligned} A = \begin{bmatrix} -82+84i & -110+128i \\ -15+104i & 63+35i \\ -101-129i & -44+78i\\ -52-72i & 16+93i\\ 32-113i & 150+136i \end{bmatrix} \end{aligned}$$

Using the Euclidean norm on complex numbers, the ternary row path norm is \( N_t(A) = 734.5230 \) along the path \( -110+128i, -15+104i, -101-129i, 16+93i, 150+136i\).

Example 7.7

Consider the matrix A with complex entries.

$$\begin{aligned} A = \begin{bmatrix} -82+84i & -110+128i \\ -15+104i & 63+35i \\ -101-129i & -44+78i\\ -52-72i & 16+93i\\ 32-113i & 150+136i \end{bmatrix} \end{aligned}$$
[Computation trace of Algorithm 7.1 for the matrix A]

Using the Manhattan norm on complex numbers, the ternary row path norm is \( N_t(A) = 997 \) along the path \( -110+128i, -15+104i, -101-129i, -52-72i, 150+136i \).

Proposition 7.8

Let \( A \in M_{m\times n}(R) \). Then,

$$\begin{aligned} N_t(A) \ge N_b(A). \end{aligned}$$

Proof

The proof is similar to that of Proposition 6.3.\(\square \)

As in the case of binary path norms, we have four different ternary path norms \( R\downarrow LR, R\uparrow LR,\) \( C \rightarrow UD \) and \( C \leftarrow UD \). By Proposition 4.7, the group action of \( D_8 \) on \( \{R\downarrow LR,\; R\uparrow LR,\; C \rightarrow UD, \; C \leftarrow UD\} \) reduces these four norms to the two distinct norms in Table 3.

Table 3 Two distinct ternary path norms

8 Ternary wrapping path seminorm and norm

As in the case of the binary path seminorm, we define the ternary wrapping path seminorm as follows.

Definition 8.1

We define \( n_{tw}: M_{m\times n} (R) \rightarrow \mathbb {R} \) as follows:

$$\begin{aligned} n_{tw}(A)=&\max \bigg \{\mid \sum _{i=1}^{m} a_{ij_i} \mid \; : \text { satisfying } (12)\bigg \}. \end{aligned}$$

where (12) is given by

$$\begin{aligned} \left. \begin{array}{rl} \quad &1 \le j_1,j_2,\dots ,j_m \le n\\ \quad &j_s = {\left\{ \begin{array}{ll} n \text { or } 1 \text { or } 2, & \text {if } j_{s-1} = 1,\\ j_{s-1}-1 \text { or } j_{s-1} \text { or } j_{s-1}+1, & \text {if } 2 \le j_{s-1} \le n-1,\\ n-1 \text { or } n \text { or } 1, & \text {if } j_{s-1} = n, \end{array}\right. }\\ \quad &\text {for } 2 \le s \le m \end{array} \right\} \end{aligned}$$
(12)

From Proposition 6.2, \( n_{tw} \) is a path seminorm. We call \( n_{tw} \) the ternary wrapping row path seminorm.

The ternary wrapping row path norm is defined as follows.

Definition 8.2

We define \( N_{tw}: M_{m\times n} (R) \rightarrow \mathbb {R} \) as follows:

$$\begin{aligned} N_{tw}(A)=&\max \bigg \{\sum _{i=1}^{m}\mid a_{ij_i} \mid \; : \text { satisfying } (12)\bigg \}. \end{aligned}$$

From Proposition 7.2, \( N_{tw} \) is a path norm, and from Proposition 7.4, it is clear that the time complexity of the brute-force method of computing the ternary wrapping row path norm is \( O(3^m n) \).

The algorithm for computing ternary wrapping row path norm is obtained by replacing Line 6 to Line 14 of Algorithm 7.1 by Line 1 to Line 13 of Algorithm 8.1, and Line 30 to Line 38 of Algorithm 7.1 by Line 14 to Line 26 of Algorithm 8.1.

[Algorithm 8.1]

From the analysis of Algorithm 7.1, it is clear that the time complexity of computing the ternary wrapping row path norm is quadratic.
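The wrapping variant of the recurrence can be sketched as follows (our own illustration, not the published listing): condition (12) is condition (11) with wrap-around, so the predecessors of column j are j-1, j and j+1 taken cyclically modulo the n columns.

```python
def ternary_wrap_path_norm(A):
    """Ternary wrapping row path norm N_tw of Definition 8.2 (a sketch)."""
    m, n = len(A), len(A[0])
    best = [abs(x) for x in A[0]]          # best path sums ending in row 1
    for i in range(1, m):
        # cyclic predecessors {j-1, j, j+1} modulo the number of columns
        best = [abs(A[i][j]) + max(best[(j - 1) % n], best[j], best[(j + 1) % n])
                for j in range(n)]
    return max(best)
```

Since every path satisfying (11) also satisfies (12), \( N_{tw}(A) \ge N_t(A) \) for every matrix A.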

9 Path norm of Church numerals

Church numerals are \( \lambda \)-terms that represent non-negative integers under the Church encoding. The \( n^{th} \) numeral is a function that maps any function f to its n-fold composition, that is, the function f composed with itself n times.

Every \( \lambda \)-term can be represented as a graph using its Böhm tree. The Böhm tree of the Church numeral \( \overline{0} \) is a tree with a single node labeled \( \lambda f x. x \), that is,

[Böhm tree of \( \overline{0} \)]

We compute the path norm of a Church numeral using the path norm of the adjacency matrix of its Böhm tree. We define the path norm of Church numerals formally as follows:

Definition 9.1

Let C denote the set of all Church numerals. We define a function \( f: C \rightarrow \mathbb {R} \) as the binary row path norm of the adjacency matrix of the Böhm tree of the Church numeral, that is,

$$\begin{aligned} f(\overline{n}) = N_b(A) \end{aligned}$$

where A is the adjacency matrix of \( BT(\overline{n}) \).

Proposition 9.2

For any Church numeral \( \overline{n} \),

$$\begin{aligned} f(\overline{n}) = n \end{aligned}$$

where \( n \in \mathbb {N} \), the natural number equivalent to \( \overline{n} \).

Proof

Let \( \overline{n} \in C \). The \( BT(\overline{n}) \) is given by (13).

Vertices of \( BT(\overline{n}) \) are relabeled as

[Relabeled vertices of \( BT(\overline{n}) \)]

Then, its adjacency matrix is

$$\begin{aligned} A = \begin{bmatrix} 0 & 1 & 0 & 0 & \cdots & 0 & 0\\ 1 & 0 & 1 & 0 & \cdots & 0 & 0\\ 0 & 1 & 0 & 1 & \cdots & 0 & 0\\ 0 & 0 & 1 & 0 & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \end{aligned}$$

The path norm \( N_b(A) = n \). Therefore, \( f(\overline{n}) = n. \) \(\square \)
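Proposition 9.2 can be verified computationally. Assuming, as in the proof above, that \( BT(\overline{n}) \) is a chain on \( n+1 \) vertices with the tridiagonal adjacency matrix shown, the sketch below (helper names are ours) recovers \( f(\overline{n}) = n \).

```python
def binary_path_norm(A):
    # binary row path norm N_b: condition (6) allows j_s = j_{s-1} or j_{s-1}+1
    m, n = len(A), len(A[0])
    best = [abs(x) for x in A[0]]
    for i in range(1, m):
        best = [abs(A[i][j]) + max(best[max(j - 1, 0):j + 1]) for j in range(n)]
    return max(best)

def church_numeral_adjacency(n):
    # adjacency matrix of the chain-shaped Boehm tree of n-bar on n+1 vertices
    k = n + 1
    return [[1 if abs(i - j) == 1 else 0 for j in range(k)] for i in range(k)]

# f(n-bar) = N_b(adjacency matrix) recovers the natural number n
assert all(binary_path_norm(church_numeral_adjacency(n)) == n for n in range(10))
```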

Proposition 9.3

Let \( \overline{m}, \overline{n} \) be Church numerals. Then, the function f is a norm, that is, it satisfies the following properties:

  1. (i)

    \(f(\overline{n}) \ge 0\) (non-negativity);

  2. (ii)

    \(f(\overline{n})=0\) if and only if \(\overline{n}=\overline{0}\) (definiteness);

  3. (iii)

    \( f(\overline{cn}) = \mid c\mid f(\overline{n}) \), for any c in \( \mathbb {N} \) (absolute homogeneity);

  4. (iv)

    \(f(\overline{m} + \overline{n})= f(\overline{m}) + f(\overline{n}) \) (strict triangular inequality);

  5. (v)

    \(f(\overline{m} *\overline{n})= f(\overline{m}) \times f(\overline{n}) \) (strict submultiplicativity).

Proof

Let \( \overline{m}, \overline{n} \in C \).

  1. (i)

    We know that \( \overline{n}\ge 0 \). By Proposition 9.2, \(f(\overline{n}) = N_b(A) = n \ge 0\).

  2. (ii)

    Suppose \( \overline{n}=\overline{0} \). Then, \(f(\overline{0}) = N_b(\left[ 0\right] ) = 0 \). Conversely, suppose that \( f(\overline{n})=0 \). Then, \( N_b(A) = 0 \implies \) \( A = \left[ 0\right] \), a zero matrix, \( \implies \overline{n}=\overline{0}\).

  3. (iii)

    Let \( c \in \mathbb {N} \). By Proposition 9.2, \( f(\overline{cn}) = cn = c f(\overline{n}) = \mid c\mid f(\overline{n}) \).

  4. (iv)

    From (2), we have \( f(\overline{m} + \overline{n}) = f(\overline{m+n}) = m+n = f(\overline{m}) + f(\overline{n})\).

  5. (v)

    From (3), we have \( f(\overline{m} *\overline{n}) = f(\overline{m\times n}) = m\times n = f(\overline{m}) \times f(\overline{n})\).\(\square \)

We call this f the strict path norm of a Church numeral.

The adjacency matrix of \( BT(\overline{n}) \) has column norm and row norm equal to 2. Consequently, the column norm and row norm cannot distinguish between any two different numerals. In contrast, the path norm of a Church numeral equals the natural number it represents and therefore naturally distinguishes distinct Church numerals.

9.1 Church Pairs

Let \( m = [m_p,m_n] \) be a Church pair. The \( \lambda \)-term of the Church pair m is \( A \equiv \lambda z.z(\lambda fx.f^{m_p}x)(\lambda gy.g^{m_n} y) \). The Böhm tree BT(A) is given by

(14)

We define a matrix M on the BT(A) as

(15)

where is a zero matrix.

Definition 9.4

Let P denote the set of all Church pairs. We define a function \( f_p: P \rightarrow \mathbb {R} \) as the binary row path norm of the matrix M of the Böhm tree of the Church pair, that is, let \( m = [m_p,m_n] \) and let A be its \( \lambda \)-term. Then,

$$\begin{aligned} f_p(m) = N_b(M) \end{aligned}$$

where M is the matrix of BT(A) as defined in (15).

Proposition 9.5

For any Church pair \( m = [m_p,m_n] \),

$$\begin{aligned} f_p(m) = m_p+m_n. \end{aligned}$$

Proof

Let \( m \in P \). The \( \lambda \)-term of m is \( A \equiv \lambda z.z(\lambda fx.f^{m_p}x) (\lambda gy.g^{m_n} y) \). The Böhm tree BT(A) is given by (14).

Vertices of BT(A) are relabeled as

[Relabeled vertices of BT(A)]

Then, its matrix M as defined in (15) is

The path norm \( N_b(M) = m_p+ m_n \).

Therefore, \( f_p(m) = m_p+ m_n. \) \(\square \)

Proposition 9.6

Let ab be Church pairs. Then, the function \( f_p \) satisfies the following properties:

  1. (i)

    \(f_p(a) \ge 0\) (non-negativity);

  2. (ii)

    \(f_p(a)=0\) if and only if \(a= [0,0]\) (definiteness);

  3. (iii)

\( f_p(ca) = \; \mid c\mid f_p(a) \), for any c in \( \mathbb {N} \) (absolute homogeneity);

  4. (iv)

    \(f_p(a \oplus b)= f_p(a) + f_p(b) \) (strict triangular inequality);

  5. (v)

    \(f_p(a \circledast b)= f_p(a) \times f_p(b) \) (strict submultiplicativity).

Proof

Let \( a = [a_p,a_n], b = [b_p,b_n] \) be Church pairs.

  1. (i)

    We know that \( a_p,a_n \ge 0 \). By Proposition 9.5, \( f_p(a) = a_p+a_n \ge 0 \). Therefore, \( f_p(a) \ge 0 \).

  2. (ii)

    Suppose that \( a= [0,0] \). Then, \( f_p(a) = a_p+a_n = 0+0 = 0\). Conversely, suppose that \( f_p(a) = 0 \). Then, \( a_p+a_n = 0 \). Since \( a_p,a_n \ge 0 \), \( a_p=a_n = 0 \) \( \implies a = [0,0]\).

  3. (iii)

Let \( c\in \mathbb {N} \). We have \( ca= [ca_p, ca_n] \). By Proposition 9.5, \( f_p(ca) = ca_p+ca_n = c(a_p + a_n) \) and \( cf_p(a) = c(a_p + a_n) \). Therefore, \( f_p(ca) = \mid c\mid f_p(a) \).

  4. (iv)

    From (4) and Proposition 9.5, we have \( f_p(a\oplus b) = f_p([a_p+b_p, a_n+b_n]) = a_p+b_p + a_n+b_n \) and \( f_p(a) + f_p(b) = a_p + a_n+b_p+b_n \). Therefore, \(f_p(a \oplus b)= f_p(a) + f_p(b) \).

  5. (v)

From (2), (3), (5) and Proposition 9.5, we have \( f_p(a\circledast b) = f_p([a_p *b_p +a_n *b_n, a_p*b_n + a_n *b_p]) = f(a_p *b_p +a_n *b_n) + f(a_p*b_n + a_n *b_p) = a_p \times b_p +a_n \times b_n + a_p\times b_n + a_n \times b_p \) and \( f_p(a) \times f_p(b) = ( a_p + a_n)\times (b_p+b_n) = a_p \times b_p +a_n \times b_n + a_p\times b_n + a_n \times b_p \). Therefore, \(f_p(a \circledast b)= f_p(a) \times f_p(b) \). \(\square \)

We call \( f_p \) the strict path norm on Church pairs.

10 Weighted path norms

Definition 10.1

Let \(\omega \ge 0;\) \( \epsilon > 0\). We define \(N_{\omega ,\epsilon }\) on a matrix A as

$$\begin{aligned} N_{\omega ,\epsilon }(A) = \omega \, N(A) + \epsilon \, \Vert A \Vert , \end{aligned}$$

where N is any path norm and \(\Vert \cdot \Vert \) is a compatible matrix norm.
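Definition 10.1 can be sketched as follows. As an illustration we take N to be the binary row path norm and \( \Vert \cdot \Vert \) the maximum absolute row sum norm; this particular pairing is our own choice, not fixed by the definition.

```python
def binary_path_norm(A):
    # the path norm N in Definition 10.1; here N = N_b (our choice)
    m, n = len(A), len(A[0])
    best = [abs(x) for x in A[0]]
    for i in range(1, m):
        best = [abs(A[i][j]) + max(best[max(j - 1, 0):j + 1]) for j in range(n)]
    return max(best)

def row_norm(A):
    # maximum absolute row sum, used as the compatible matrix norm ||A||
    return max(sum(abs(x) for x in row) for row in A)

def weighted_path_norm(A, w, eps):
    """N_{w,eps}(A) = w * N(A) + eps * ||A||  (Definition 10.1)."""
    return w * binary_path_norm(A) + eps * row_norm(A)
```

With weights satisfying \( \omega + \epsilon = 1 \), this gives a strictly weighted path norm.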

Proposition 10.2

Let \( A,B \in M_{m\times n}(R) \). Then, \(N_{\omega ,\epsilon }\) satisfies the following properties:

  1. (i)

    \(N_{\omega ,\epsilon }(A)\ge 0\) (non-negativity);

  2. (ii)

    \(N_{\omega ,\epsilon }(A)=0\) if and only if \(A=0\) (definiteness);

  3. (iii)

    \( N_{\omega ,\epsilon }(cA) \le \; \mid c\mid N_{\omega ,\epsilon }(A) \) (weak absolute homogeneity);

  4. (iv)

    \(N_{\omega ,\epsilon }(A+B)\le N_{\omega ,\epsilon }(A)+N_{\omega ,\epsilon }(B)\) (triangular inequality);

  5. (v)

    \(N_{\omega ,\epsilon }(Ax)\le N_{\omega ,\epsilon }(A)N_{\omega ,\epsilon }(x)\) for any column vector x (compatibility with a vector norm);

  6. (vi)

    \(N_{\omega ,\epsilon }\) is a norm on \( M_{m\times n}(R) \).

Proof

The proofs are similar to that of Proposition 4.2. \(\square \)

We call \(N_{\omega ,\epsilon }\) the weighted path norm. If \( \omega +\epsilon =1, \) then we call \(N_{\omega ,\epsilon }\) the strictly weighted path norm.

10.1 Magic squares of order 3

The eight configurations of solutions of \( 3 \times 3 \) magic squares with magic sum 15 are numbered 1 to 8. The arrows with R (C) give the row (column) path, and the suffix is the column (row) number where the path begins. The direction of the arrows gives the path traced.

[Figures of the eight magic square configurations with their row and column path norms]

We observe that path norm pairs distinguish the matrices.

We sort the magic squares in lexicographical order of \( R\downarrow L \) with respect to the column that gives the path and \( C\rightarrow U \) with respect to the row that gives the path.

Let the weights given to \( R\downarrow L \) norm be \( \omega \) and to \( C\rightarrow U \) norm be \( \epsilon \) as given in Table 4.

Table 4 Weights for the norms \( R\downarrow L \) and \( C \rightarrow U \)
Table 5 Strictly weighted path norm of all Magic Squares of order 3

In this case, there exists a weight function such that \( \omega +\epsilon = 1 \), yielding a strictly weighted path norm (refer Tables 4 and 5). However, this is not always the case. There are examples wherein, even after arranging the configurations in lexicographical order of path norms, there is no weight function satisfying \( \omega +\epsilon = 1 \). To illustrate this fact, we take the examples of Latin Squares in Sect. 10.2.

10.2 Latin squares of order 3

Given a Latin square of order 3, by permuting its rows and columns we obtain 12 distinct Latin square configurations. The Latin squares of order 3, with their row and column path norms, are numbered 1 to 12:

[Figures of the twelve Latin square configurations with their row and column path norms]

The row norm and column norm of all these matrices equal 6, so these norms cannot distinguish the Latin squares. We use the weighted path norm to easily distinguish the various configurations.

Let the weight function for the norm \( R\downarrow L \) be \( \omega \) and for norm \( C \rightarrow U \) be \( \epsilon \) as given in Table 6.

Table 6 Weights for the norms \( R\downarrow L \) and \( C \rightarrow U \)
Table 7 Weighted path norm of all Latin Squares of order 3

In this case, the weights of the path norms do not sum to 1. In fact, there are no weights for the configurations in Table 7 that sum to 1 while keeping the weighted norms sorted.

11 Condition number

Let \( Ax=b \) be a linear system and let N be any path norm. Then, we have

$$\begin{aligned} N(b) = N(Ax). \end{aligned}$$

As \( N(Ax) \le N(A) N(x) \) (refer Proposition 4.2 and Proposition 7.2), we have

$$\begin{aligned} N(b)&\le N(A) N(x). \nonumber \\ \implies \frac{1}{N(x)}&\le \frac{N(A)}{N(b)} \end{aligned}$$
(16)

Let \( \delta b \) be a small change in b. Let \( \delta x \) be the corresponding change in the solution x. Then, \( A(x+\delta x) = b+\delta b \), that is, \( Ax+A\delta x = b+\delta b \). This gives \( A\delta x = \delta b \), which implies \( \delta x = A^{-1}\delta b \).

From Proposition 4.2 and Proposition 7.2, we have \( N(\delta x) = N(A^{-1} \delta b) \le N(A^{-1})N(\delta b) \). This implies

$$\begin{aligned} N(\delta x) \le N(A^{-1})N(\delta b) \end{aligned}$$
(17)

Multiplying equations (16) and (17), we get

$$\begin{aligned} \frac{N(\delta x)}{N(x)} \le N(A)N(A^{-1})\frac{N(\delta b)}{N(b)} \end{aligned}$$

The number \( \kappa _{N} = N(A)N(A^{-1}) \) is the condition number of A computed using path norms.

The condition number of a well-conditioned matrix is expected to be low, whereas the condition number of an ill-conditioned matrix is expected to be high. We present examples wherein the condition number computed using path norms meets these expectations better than other standard norms.

Example 11.1

Consider the well-conditioned matrix

$$\begin{aligned} A = \begin{bmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 2 & 0 \\ -2 & 0 & 1 & -1 \\ 2 & -2 & 0 & 1 \end{bmatrix}. \end{aligned}$$

Then,

$$\begin{aligned} A^{-1} = \begin{bmatrix} 0.5 & 0.25 & -0.5 & 0 \\ 0 & 0.25 & -0.5 & -0.5 \\ 0 & 0.5 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{bmatrix}. \end{aligned}$$

Condition numbers based on the path norm, column norm, row norm and Euclidean norms are given in Table 8.

Table 8 Comparison of condition numbers based on different norms
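The computation of \( \kappa _N \) for Example 11.1 can be sketched as follows. We use the binary row path norm as the illustrative path norm (the specific path norm behind Table 8 is not restated here, so this choice is our assumption), with A and \( A^{-1} \) as given above.

```python
def binary_path_norm(A):
    # binary row path norm N_b, used here as the illustrative path norm N
    m, n = len(A), len(A[0])
    best = [abs(x) for x in A[0]]
    for i in range(1, m):
        best = [abs(A[i][j]) + max(best[max(j - 1, 0):j + 1]) for j in range(n)]
    return max(best)

# the matrix A of Example 11.1 and its inverse
A = [[0, 0, 0, -1],
     [0, 0, 2, 0],
     [-2, 0, 1, -1],
     [2, -2, 0, 1]]
A_inv = [[0.5, 0.25, -0.5, 0.0],
         [0.0, 0.25, -0.5, -0.5],
         [0.0, 0.5, 0.0, 0.0],
         [-1.0, 0.0, 0.0, 0.0]]

# condition number kappa_N = N(A) * N(A^{-1})
kappa = binary_path_norm(A) * binary_path_norm(A_inv)
```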

Example 11.2

Consider the ill-conditioned matrix

$$\begin{aligned} A= \begin{bmatrix} 0.51 & -0.78 & -0.84\\ -0.24 & 1.67 & 0.38\\ -0.83 & 0.03 & 1.39 \end{bmatrix}. \end{aligned}$$

Then,

$$\begin{aligned} A^{-1}= \begin{bmatrix} 414.1832 & 189.8870 & 198.3862\\ 3.2634 & 2.0979 & 1.3986\\ 247.2476 & 113.3405 & 119.1501 \end{bmatrix}. \end{aligned}$$
Table 9 Comparison of condition numbers based on different norms

From Table 9, it is clear that the condition number of A computed using the path norm is the largest, as expected.

11.1 Conclusions

We have defined different versions of binary and ternary path norms and presented efficient algorithms to compute them. As the notion of a path norm is defined over an arbitrary normed ring, one can compute path norms using different norms of the ring. We have illustrated this in Example 4.5 and Example 4.6 by computing path norms of a complex-valued matrix using the Euclidean and Manhattan norms, respectively. We have defined a path norm on Church numerals and extended it to Church pairs; these norms naturally distinguish the numerals. We have used path norms to order all the possible magic square solutions of size 3. Further, we have used weighted path norms to order the Latin squares of size 3. In the process, we have shown the benefits of path norms in (i) ordering solutions of magic squares and Latin squares and (ii) computing the condition number of a matrix. The concept of binary and ternary norms can be extended to n-ary norms on a matrix.

Matrix norms have applications in machine learning algorithms, especially in implementing regularization (for example, see Kowalski (2009), Yildirim and Özkale (2022)). There is scope to explore the use of path norms in machine learning algorithms to obtain models that are more robust.

The concept of three-dimensional matrix norm has applications in mathematical finance to deal with stochastic volatility (refer İzgi (2015), İzgi and Özkaya (2017)). Naturally, we can extend path norms to three-dimensional matrix path norms and investigate corresponding applications.