1 Introduction

In the last 20 years, the study of complex networks has permeated many areas of social and natural sciences. Important examples are computer, telecommunication, biological, cognitive, semantic and social networks. Network structures are usually modeled using graph theory to represent pairwise interactions between elements of a network. For this reason, it is particularly important to find advantageous ways to represent and compare graphs. Many graph distances have been proposed from both an applied and a theoretical perspective. In applications, the most common pseudo-distances are inspired by local comparison (e.g., Hamming distance, Jaccard distance) and/or global spectral methods. For an overview of the graph pseudo-distances most commonly used to compare empirical networks in practice, see [11]. From a theoretical point of view, the selection of a metric on the space of graphs is related to graph limit theory. This is a very active field of mathematics that connects graph theory with many other mathematical areas such as stochastic processes, ergodic theory, spectral theory and several branches of analysis and topology. In fact, in mathematical terms, one is interested in finding a metric/topology on the space of graphs, and a completion of this space with respect to that topology. Traditionally, this field grew in two distinct directions: limits of graph sequences of bounded degree on the one hand (Benjamini–Schramm convergence [2] and the stronger notion of local–global convergence [3, 12]), and limits of dense graph sequences on the other hand (graphons and the related notion of cut-metric [9, 15, 17]). For a complete treatment of these topics, see the monograph by Lovász [16]. More recently, the challenging intermediate case of sparse graph sequences with unbounded degree has attracted a lot of interest, as it covers the vast majority of networks in applications: real networks are usually sparse, but not extremely sparse, and they are typically heterogeneous. For instance, there is the relaxation of graphons to less dense graph sequences, namely \(L^p\)-graphons [7, 8]. In a recent paper, Backhausz and Szegedy introduced a new functional-analytic/measure-theoretic notion of convergence [1], which not only covers the intermediate degree case, but also unifies the graph limit theories mentioned previously. Other works in this direction are [5, 6, 13, 14, 19].

In this paper, we contribute to the study of representation and comparison of graphs, by introducing and investigating the following measure-theoretic representation of matrices.

Definition 1.1

Let A be an \(n\times n\) matrix and, given \(\textbf{x}=(x_i)_i\in \mathbb {R}^n\), let

$$\begin{aligned} \mu ^{\textbf{p}}_{(A,\,\textbf{x})}:=\sum _{i=1}^n p_i\cdot \delta _{(x_i,(A\textbf{x})_i)}, \end{aligned}$$

where \(\delta \) denotes the Dirac measure, and \(p_1,\ldots ,p_n\in \mathbb {R}_{>0}\) such that \(\sum _{i=1}^n p_i=1\) are fixed. Let also \(\textbf{p}:=(p_i)_i\in \mathbb {R}^n\). We will simply denote by \(\mu _{(A,\,\textbf{x})}\) the measure \(\mu ^{\textbf{p}}_{(A,\,\textbf{x})}\) when there is no risk of confusion about the given probability vector \(\textbf{p}\).

The family of measures generated by A is

$$\begin{aligned} (\mu _{(A,\,\textbf{x})})_{\textbf{x}\in \mathbb {R}^n }. \end{aligned}$$

The set of measures generated by A is

$$\begin{aligned} \mathcal {Z}_A:=\{\mu _{(A,\,\textbf{x})}:\textbf{x}\in \mathbb {R}^n\}. \end{aligned}$$

We will drop the A in the subscript of \(\mu _{(A,\,\textbf{x})}\) and \(\mathcal {Z}_A\), writing \(\mu _\textbf{x}\) and \(\mathcal {Z}\), respectively, whenever the dependence on the matrix A is obvious.
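For concreteness, the following Python sketch (hypothetical helper name `generated_measure`, assuming NumPy) builds the discrete measure \(\mu ^{\textbf{p}}_{(A,\,\textbf{x})}\) of Definition 1.1 as a dictionary mapping atoms \((x_i,(A\textbf{x})_i)\) to their total weights.

```python
import numpy as np

def generated_measure(A, x, p=None):
    """Return mu^p_(A,x) of Definition 1.1 as a dict {(x_i, (Ax)_i): weight}.

    A is an (n, n) array, x an (n,) array, p a probability vector
    (uniform if None).  Coinciding atoms have their weights summed, so the
    dict is exactly the measure sum_i p_i * delta_{(x_i, (Ax)_i)}.
    """
    A, x = np.asarray(A, dtype=float), np.asarray(x, dtype=float)
    n = len(x)
    p = np.full(n, 1.0 / n) if p is None else np.asarray(p, dtype=float)
    Ax = A @ x
    mu = {}
    for xi, yi, pi in zip(x, Ax, p):
        key = (round(float(xi), 12), round(float(yi), 12))  # merge equal atoms
        mu[key] = mu.get(key, 0.0) + float(pi)
    return mu

# The measure generated by the all-ones vector: its atoms are (1, i-th row sum).
A = np.array([[0, 1, 1], [0, 1, 0], [1, 1, 0]])
print(generated_measure(A, np.ones(3)))
```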

We use this representation to define a new notion of pseudo-metric on the space of matrices. Moreover, we show that, for this pseudo-metric, the distance between two matrices A and B is zero if and only if A and B are switching equivalent, i.e., there exists a permutation matrix P such that \(A=PBP^\top \). Formally,

Theorem 1.2

Let \(K\in \mathbb {R}_{\ge 1}\) and let \(\mathcal {S}\) be the set of \(n\times n\) matrices A such that \(||A||_{\infty \rightarrow 1}\le K\). Then, a matrix \(A\in \mathcal {S}\) is determined, up to switching equivalence, by the set \(\mathcal {Z}_A\) of measures generated by A, where we consider the measures relative to the uniform probability measure.

For example, if two matrices A and B are the adjacency (or Laplacian) matrices of two graphs \(G_1\) and \(G_2\), respectively, then they are switching equivalent if and only if \(G_1\) and \(G_2\) are isomorphic. As a consequence, the above framework will allow us to define a metric on the space of isomorphism classes of graphs. Additionally, we study how some properties of graphs, such as the spectrum, the vertex degrees, and some homomorphism numbers, translate into this measure representation.
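For small matrices, switching equivalence in the sense above (permutation similarity) can be checked by brute force over permutation matrices. The following Python sketch, with hypothetical helper names, does exactly this; it is meant only as an illustration, not as an efficient isomorphism test.

```python
import numpy as np
from itertools import permutations

def switching_equivalent(A, B):
    """Return True if A = P B P^T for some permutation matrix P (brute force)."""
    A, B = np.asarray(A), np.asarray(B)
    n = A.shape[0]
    for perm in permutations(range(n)):
        P = np.eye(n)[list(perm)]          # the permutation matrix of perm
        if np.allclose(A, P @ B @ P.T):
            return True
    return False

# Two adjacency matrices of the same path on 3 vertices, with relabelled vertices.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
B = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]])
print(switching_equivalent(A, B))  # True: the underlying graphs are isomorphic
```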

This representation and metric are inspired by the notion of action convergence in graph limit theory [1], but they are simpler. For this reason, our analysis also contributes to a simpler understanding of the limit notion for graphs. Our results show that, for discrete probability measures with finite support, this distance has the same expressive power as the more complex action convergence metric [1, Definition 2.6]. However, for general measures it remains open whether this metric has strictly less expressive power.

1.1 Structure of the paper

This work is structured as follows. In Sect. 2 we introduce some relevant definitions and notations, in Sect. 3 we give some preliminary results, and in Sect. 4 we prove Theorem 1.2. Moreover, in Sect. 5 we relate properties of matrices and graphs with the measure representation, and in Sect. 6 we underline the relationship with action convergence. Finally, in Sect. 7, we present some open questions as well as future directions.

2 Basic definitions and notations

Throughout the paper we fix \(n\in \mathbb {N}_{\ge 2}\), we let \(\mathcal {P}\) denote the set of permutation matrices of order n, and we let \(\mathbf {e_i}\) denote the i-th vector of the canonical basis of \(\mathbb {R}^n\), for \(i=1,\ldots ,n\). We use \(\delta \) to denote a Dirac measure and \(\mathcal {P}\left( \mathbb {R}^{p}\right) \) to represent the space of probability measures in \(\mathbb {R}^p\).

Definition 2.1

Let \(\rho \) be a probability measure on \(\mathbb {R}^2\) whose support is given by exactly n points. A pair of vectors \((\textbf{x},\textbf{y})\) with \(\textbf{x}=(x_i)_i, \textbf{y}=(y_i)_i\in \mathbb {R}^n\) is an ordered support of \(\rho \) if

$$\begin{aligned} x_1\le \ldots \le x_n \end{aligned}$$

and

$$\begin{aligned} \rho =\sum _{i=1}^n p_i\cdot \delta _{(x_i,y_i)}, \end{aligned}$$

for some \(p_1,\ldots ,p_n\in \mathbb {R}_{>0}\) such that \(\sum _{i=1}^n p_i=1\).

Observe that, in the above definition, if the entries of \(\textbf{x}\) are all pairwise distinct, then there is a unique ordered support of \(\rho \).

From here on, we also fix \(p_1,\ldots ,p_n\in \mathbb {R}_{>0}\) such that \(\sum _{i=1}^n p_i=1\), and we let \(\textbf{p}:=(p_i)_i\in \mathbb {R}^n\).

In the following, we will mainly focus on the case where \(\textbf{p}\) is the uniform distribution on [n]. However, we point out that, in some cases, other probability measures can be more appropriate. For example, if we consider the family of measures generated by the transition probability matrix of a reversible Markov chain, an appropriate choice for \(\textbf{p}\) would be the stationary distribution of the Markov chain.

Given an \(n\times n\) matrix A and a vector \(\textbf{x}=(x_i)_i\in \mathbb {R}^n\), we define the marginal of \(\mu ^{\textbf{p}}_{(A,\,\textbf{x})}\) with respect to the first variable as the discrete measure \(\mu ^{\textbf{p},M}_\textbf{x}\) such that

$$\begin{aligned} \mu ^{\textbf{p},M}_\textbf{x}(\{q_1\})&:=\sum _{q_2\in \mathbb {R}} \left( \sum _{i=1}^n p_i\cdot \delta _{(x_i,(A\textbf{x})_i)}(\{(q_1,q_2)\})\right) \\&\;=\sum _{q_2\in \{(A\textbf{x})_j: \ j\in [n]\}}\left( \sum _{i=1}^n p_i\cdot \delta _{(x_i,(A\textbf{x})_i)}(\{(q_1,q_2)\})\right) \end{aligned}$$

for every \(q_1\in \mathbb {R}\). Since the measure is discrete, it is completely characterized by the previous formula. Also in this case, we will denote by \(\mu ^{M}_\textbf{x}\) the measure \(\mu ^{\textbf{p},M}_\textbf{x}\) when there is no risk of confusion. We also let

$$\begin{aligned} \mathcal {Z}_{(A,\,\textbf{x})}:=\mathcal {Z}_{\textbf{x}}:=&\left\{ \mu _{\textbf{y}}\in \mathcal {Z}: \mu ^M_{\textbf{y}}=\sum _{i=1}^n p_i\cdot \delta _{x_i}\right\} \\ =&\bigl \{\mu _{P\textbf{x}}\in \mathcal {Z}:\, P\in \mathcal {P} \text { and } P\textbf{p}=\textbf{p}\bigr \}. \end{aligned}$$

We observe that \(\mathcal {Z}_{\textbf{x}}\) has finitely many elements, since \(\mathcal {P}\) is a finite set, and that \(\mathcal {Z}_{\textbf{x}}=\mathcal {Z}_{P\textbf{x}}\) for all \(P\in \mathcal {P}\).

Example 2.2

As an easy example, assume that \(n=2\) and \(p_1=p_2=\frac{1}{2}\). Then, \(\mathcal {P}=\{{{\,\textrm{Id}\,}},P\}\), where

$$\begin{aligned} {{\,\textrm{Id}\,}}:=\begin{pmatrix}1 &{} 0\\ 0 &{} 1 \end{pmatrix} \quad \text {and}\quad P:=\begin{pmatrix}0 &{} 1\\ 1 &{} 0 \end{pmatrix}. \end{aligned}$$

Let also

$$\begin{aligned} A:=\begin{pmatrix}a_{11} &{} a_{12}\\ a_{21} &{} a_{22} \end{pmatrix}. \end{aligned}$$

Then,

$$\begin{aligned} \mu _{\mathbf {e_1}}=\mu _{{{\,\textrm{Id}\,}}\mathbf {e_1}}=\frac{1}{2}\cdot \bigl (\delta _{(1,a_{11})}+\delta _{(0,a_{21})}\bigr ) \end{aligned}$$

and

$$\begin{aligned} \mu _{\mathbf {e_2}}= \mu _{P\mathbf {e_1}}=\frac{1}{2}\cdot \bigl (\delta _{(0,a_{12})}+\delta _{(1,a_{22})}\bigr ). \end{aligned}$$

In particular,

$$\begin{aligned} \mathcal {Z}_{\mathbf {e_1}}=\mathcal {Z}_{\mathbf {e_2}}=\{\mu _{\mathbf {e_1}},\mu _{\mathbf {e_2}}\}. \end{aligned}$$
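The computation in Example 2.2 can be checked numerically with a few lines of Python; the sketch below (hypothetical names, arbitrary sample entries for A) simply lists the atoms of \(\mu _{\mathbf {e_1}}\) and \(\mu _{\mathbf {e_2}}\).

```python
import numpy as np

def generated_measure(A, x, p):
    """mu_(A,x) as a dict {atom: weight}; atoms are assumed pairwise distinct here."""
    Ax = np.asarray(A, dtype=float) @ np.asarray(x, dtype=float)
    return {(float(xi), float(yi)): float(pi) for xi, yi, pi in zip(x, Ax, p)}

a11, a12, a21, a22 = 2.0, 3.0, 5.0, 7.0       # arbitrary sample entries
A = np.array([[a11, a12], [a21, a22]])
p = [0.5, 0.5]

mu_e1 = generated_measure(A, [1.0, 0.0], p)   # atoms (1, a11) and (0, a21)
mu_e2 = generated_measure(A, [0.0, 1.0], p)   # atoms (0, a12) and (1, a22)
print(mu_e1)   # {(1.0, 2.0): 0.5, (0.0, 5.0): 0.5}
print(mu_e2)   # {(0.0, 3.0): 0.5, (1.0, 7.0): 0.5}
```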

The following vectors will play an important role in the proof of our main result. Therefore, we give them a name.

Definition 2.3

Let A be an \(n\times n\) matrix and let \((\mu _\textbf{x})_{\textbf{x}\in \mathbb {R}^n }\) be the family of measures generated by A. A vector \(\textbf{v}\in \mathbb {R}^n\) is \(\textbf{p}-\)irreducible for A if the following condition holds. For every \(P_1\in \mathcal {P}\) and for every vector \(\textbf{y}\in \mathbb {R}^n\),

$$\begin{aligned} \mu ^{\textbf{p}}_{P_1\textbf{v}}=\mu ^{\textbf{p}}_{\textbf{y}} \end{aligned}$$

if and only if there exists \(P_2\in \mathcal {P}\) such that \(\textbf{y}=P_2P_1\textbf{v}\), \(P_2A=AP_2\) and \(P_2\textbf{p}=\textbf{p}\). We call it irreducible for A if it is \(\textbf{u}-\)irreducible for A, where \(\textbf{u}\) is the uniform n-dimensional probability vector.

Notice that, if a vector is irreducible for A, then it is \(\textbf{p}-\)irreducible for every probability vector \(\textbf{p}\). For this reason, we will just consider irreducible vectors for A in our following arguments.

We consider the following

Example 2.4

For the matrix

$$\begin{aligned} A=\begin{bmatrix} 0 &{} 2 \\ 3 &{} 1 \end{bmatrix}, \end{aligned}$$

the vector

$$\begin{aligned} \textbf{v}=\begin{bmatrix} 1\\ 0 \end{bmatrix} \end{aligned}$$

is irreducible, while the vector

$$\begin{aligned} \textbf{x}=\begin{bmatrix} -1\\ 1 \end{bmatrix} \end{aligned}$$

is not an irreducible vector. Indeed, for the swap permutation \(P\) we have \(\mu _{P\textbf{x}}=\mu _{\textbf{x}}\), but the only permutation \(P_2\) satisfying \(\textbf{x}=P_2P\textbf{x}\) is \(P_2=P\), which does not commute with A.
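Definition 2.3 can be tested by brute force for small n: by Lemma 3.1 below, an equality \(\mu _{P_1\textbf{v}}=\mu _{\textbf{y}}\) forces \(\textbf{y}\) to be a permutation of \(P_1\textbf{v}\), so it suffices to test candidates of the form \(QP_1\textbf{v}\). The following Python sketch (hypothetical names, uniform weights) verifies the two claims of Example 2.4.

```python
import numpy as np
from itertools import permutations

def measure(A, x):
    """mu_(A,x) with uniform weights: the sorted list of atoms (with multiplicity)
    determines the measure completely."""
    Ax = np.asarray(A, dtype=float) @ np.asarray(x, dtype=float)
    return tuple(sorted((round(float(xi), 9), round(float(yi), 9))
                        for xi, yi in zip(x, Ax)))

def is_irreducible(A, v):
    """Brute-force check of Definition 2.3 for uniform p: whenever
    mu_{P1 v} = mu_{Q P1 v}, some P2 with P2 A = A P2 must map P1 v to Q P1 v."""
    A, v = np.asarray(A, dtype=float), np.asarray(v, dtype=float)
    n = len(v)
    perms = [np.eye(n)[list(s)] for s in permutations(range(n))]
    for P1 in perms:
        for Q in perms:
            if measure(A, P1 @ v) == measure(A, Q @ P1 @ v):
                ok = any(np.allclose(P2 @ A, A @ P2) and
                         np.allclose(P2 @ (P1 @ v), Q @ (P1 @ v)) for P2 in perms)
                if not ok:
                    return False
    return True

A = np.array([[0, 2], [3, 1]])
print(is_irreducible(A, np.array([1.0, 0.0])))   # True
print(is_irreducible(A, np.array([-1.0, 1.0])))  # False
```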

We now want to be able to compare measures. For this reason, we recall the following well-known metric:

Definition 2.5

(Lévy–Prokhorov metric) The Lévy–Prokhorov Metric \(d_{\textrm{LP}}\) on the space of probability measures \(\mathcal {P}\left( \mathbb {R}^{p}\right) \) is

$$\begin{aligned} \begin{aligned} d_{\textrm{LP}}\left( \eta _{1}, \eta _{2}\right)&=\inf \left\{ \varepsilon >0: \eta _{1}(U) \le \eta _{2}\left( U^{\varepsilon }\right) +\varepsilon \quad \text {and} \right. \\&\quad \left. \eta _{2}(U) \le \eta _{1}\left( U^{\varepsilon }\right) +\varepsilon \text { for all } U \in \mathcal {B}_{p}\right\} , \end{aligned} \end{aligned}$$

where \(\mathcal {B}_{p}\) is the Borel \(\sigma \)-algebra on \(\mathbb {R}^{p}\) and \(U^{\varepsilon }\) is the set of points that have distance smaller than \(\varepsilon \) from U.

The above metric metrizes the weak/narrow convergence for measures. Now, we want to be able to compare sets of measures. We therefore introduce the following

Definition 2.6

(Hausdorff metric) Given \(X, Y\subset \mathcal {P}\left( \mathbb {R}^{p}\right) \), their Hausdorff distance is

$$\begin{aligned} d_{H}(X, Y):=\max \left\{ \sup _{x \in X} \inf _{y \in Y} d_{\textrm{LP}}(x, y), \sup _{y \in Y} \inf _{x \in X} d_{\textrm{LP}}(x, y)\right\} \end{aligned}$$

Note that \(d_{H}(X, Y)=0\) if and only if \({\text {cl}}(X)={\text {cl}}(Y)\), where \({\text {cl}}\) is the closure in \(d_{\textrm{LP}}\). It follows that \(d_{H}\) is a pseudometric on the set of all subsets of \(\mathcal {P}\left( \mathbb {R}^{p}\right) \), and it is a metric on closed sets.

By definition, the Lévy–Prokhorov distance between probability measures is upper-bounded by 1 and, therefore, the Hausdorff metric for sets of measures is also upper-bounded by 1.
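For measures with small finite support, both distances can be approximated numerically. The sketch below (hypothetical names, Euclidean distance on \(\mathbb {R}^2\)) exploits the fact that, for atomic measures, the conditions in Definition 2.5 only need to be checked for U ranging over subsets of the supports, and then locates \(d_{\textrm{LP}}\) by bisection; the Hausdorff distance of Definition 2.6 is computed on top of it.

```python
import numpy as np
from itertools import chain, combinations

def _subsets(points):
    return chain.from_iterable(combinations(points, r) for r in range(len(points) + 1))

def _deficiency(mu, nu, eps):
    """max over U contained in supp(mu) of  mu(U) - nu(U^eps)."""
    worst = 0.0
    for U in _subsets(list(mu)):
        mass_mu = sum(mu[q] for q in U)
        mass_nu = sum(w for q, w in nu.items()
                      if any(np.hypot(q[0] - u[0], q[1] - u[1]) < eps for u in U))
        worst = max(worst, mass_mu - mass_nu)
    return worst

def lp_distance(mu, nu, tol=1e-4):
    """Approximate Levy-Prokhorov distance between two finitely supported
    measures on R^2 (dicts atom -> weight), found by bisection on eps."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        eps = (lo + hi) / 2
        if _deficiency(mu, nu, eps) <= eps and _deficiency(nu, mu, eps) <= eps:
            hi = eps
        else:
            lo = eps
    return hi

def hausdorff(X, Y, tol=1e-4):
    """Hausdorff distance (with respect to d_LP) between two finite sets of measures."""
    nearest = lambda m, S: min(lp_distance(m, s, tol) for s in S)
    return max(max(nearest(x, Y) for x in X), max(nearest(y, X) for y in Y))

mu = {(0.0, 0.0): 0.5, (1.0, 1.0): 0.5}
nu = {(0.0, 0.1): 0.5, (1.0, 1.0): 0.5}
print(lp_distance(mu, nu))           # about 0.1
print(hausdorff([mu], [mu, nu]))     # about 0.1
```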

Now, we define the \(L^\infty \) and the \(L^1\) norm of a vector \(\textbf{v}\in \mathbb {R}^n\) as

$$\begin{aligned} \left\| \textbf{v}\right\| _{\infty }:=\max _{i\in [n]}|v_i| \end{aligned}$$

and

$$\begin{aligned} \left\| \textbf{v}\right\| _{1}:=\sum _{i\in [n]}p_i|v_i|, \end{aligned}$$

respectively.

Additionally, we define the \((\infty \rightarrow 1)-\)operator norm of an \(n\times n\) matrix A as

$$\begin{aligned} \Vert A\Vert _{\infty \rightarrow 1}:=\sup _{\textbf{v}\in \mathbb {R}^n,\; \textbf{v}\ne 0}\frac{\Vert A\textbf{v}\Vert _{1} }{\Vert \textbf{v}\Vert _{\infty }}. \end{aligned}$$
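Since \(\textbf{v}\mapsto \Vert A\textbf{v}\Vert _1\) is convex, the supremum over the cube \(\Vert \textbf{v}\Vert _{\infty }\le 1\) is attained at a vertex, i.e., at a sign vector. The following Python sketch (hypothetical names) uses this observation to compute the \((\infty \rightarrow 1)-\)norm with respect to the weighted \(L^1\) norm above by enumerating sign vectors; this is only feasible for small n.

```python
import numpy as np
from itertools import product

def weighted_l1(v, p):
    """||v||_1 = sum_i p_i |v_i| as defined above."""
    return float(np.sum(np.asarray(p) * np.abs(v)))

def op_norm_inf_to_1(A, p):
    """(infinity -> 1)-operator norm: maximum of ||As||_1 over sign vectors s."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    return max(weighted_l1(A @ np.array(s), p)
               for s in product([-1.0, 1.0], repeat=n))

A = np.array([[0, 1, 1], [0, 1, 0], [1, 1, 0]])
p = np.full(3, 1 / 3)
print(op_norm_inf_to_1(A, p))   # 5/3; at most n - 1 = 2 for an adjacency matrix
```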

3 Preliminary results

In this section we prove some preliminary results that will be needed for the proof of the main one.

Lemma 3.1

Let A be an \(n\times n\) matrix and let \((\mu _\textbf{x})_{\textbf{x}\in \mathbb {R}^n }\) be the family of measures generated by A. Then, \(\textbf{x}, \textbf{y}\in \mathbb {R}^n\) are such that

$$\begin{aligned} \mu _\textbf{x}=\mu _\textbf{y} \end{aligned}$$

if and only if there exists \(P\in \mathcal {P}\) such that

$$\begin{aligned} \textbf{y}=P\textbf{x}, \quad P\textbf{p}=\textbf{p} \quad \text {and }\quad \textbf{x}\in \ker (PA-AP). \end{aligned}$$

Proof

By definition, \(\mu _\textbf{x}=\mu _\textbf{y}\) if and only if

$$\begin{aligned} \sum _{i=1}^n p_i\cdot \delta _{(x_i,(A\textbf{x})_i)}=\sum _{i=1}^n p_i\cdot \delta _{(y_i,(A\textbf{y})_i)}, \end{aligned}$$

hence, if and only if there exists \(P\in \mathcal {P}\) such that \((P\textbf{x},PA\textbf{x})=(\textbf{y},A\textbf{y})\) and \(P\textbf{p}=\textbf{p}\). This happens if and only if \(P\textbf{p}=\textbf{p}\), \(\textbf{y}=P\textbf{x}\) and \(PA\textbf{x}=AP\textbf{x}\), hence, if and only if \(P\textbf{p}=\textbf{p}\), \(\textbf{y}=P\textbf{x}\) and \(\textbf{x}\in \ker (PA-AP)\). \(\square \)

An immediate consequence is the following

Corollary 3.2

Let A be an \(n\times n\) matrix and let \((\mu _\textbf{x})_{\textbf{x}\in \mathbb {R}^n }\) be the family of measures generated by A. For each \(P\in \mathcal {P}\) such that

$$\begin{aligned} PA=AP \quad \text {and} \quad P\textbf{p}=\textbf{p}, \end{aligned}$$

we have that, for all \(\textbf{x}\in \mathbb {R}^n\),

$$\begin{aligned} \mu _\textbf{x}=\mu _{P\textbf{x}}. \end{aligned}$$

We now prove a preliminary lemma that will be needed for the proof of the next theorem.

Lemma 3.3

Fix \(N\in \mathbb {N}\setminus \{0\}\). Given N non-zero \(n\times n\) matrices \(K_i\), for \(i\in [N]\), there exists a vector \(\textbf{v}\in \mathbb {R}^n\) such that \(\textbf{v}\notin \ker (K_i)\) for all \(i\in [N]\).

Proof

We prove the claim by induction over N. For \(N=1\), the claim is trivially true, as the matrix is non-zero.

We now assume the statement to be true for \(N-1\), and we prove it for N. By the inductive hypothesis, there exists \(\textbf{v}\in \mathbb {R}^n\) such that \(\textbf{v}\notin \ker (K_i)\) for every \(i\in [N-1]\). By the base case, there also exists \(\textbf{w}\in \mathbb {R}^n\) such that \(\textbf{w}\notin \ker (K_N)\). We claim that we can choose \(\alpha >0 \) such that \(\textbf{v}+\alpha \textbf{w}\notin \ker (K_i)\) for every \(i\in [N]\). In fact, for every \(i\in [N]\), using linearity and the reverse triangle inequality we have

$$\begin{aligned} \Vert K_i(\textbf{v}+\alpha \textbf{w})\Vert \ge |\Vert K_i\textbf{v}\Vert -\alpha \Vert K_i\textbf{w}\Vert |. \end{aligned}$$

We now notice that, for every \(i\in [N]\), \(K_i\textbf{v}\) and \(K_i\textbf{w}\) are not both zero, as a consequence of the above discussion. We can therefore choose \(\alpha >0\) such that \(\alpha \Vert K_i\textbf{w}\Vert \ne \Vert K_i\textbf{v}\Vert \) for every \(i\in [N]\): only finitely many values of \(\alpha \) are excluded, namely the slopes of the lines through the origin and the points with coordinates \((\Vert K_i\textbf{w}\Vert ,\Vert K_i\textbf{v}\Vert )\). For such an \(\alpha \), the right-hand side above is strictly positive for every \(i\in [N]\). \(\square \)

We also notice that, in the previous lemma, we could have considered, more generally, bounded operators on a normed vector space instead of square matrices, and even countably many bounded operators, instead of finitely many, on a complete normed vector space. In fact, the proof idea of Lemma 3.3 rewritten in set-theoretic language reads: the kernel of a non-zero bounded operator is closed and nowhere dense; therefore, a countable union of such kernels is a set of first category, and its complement is dense in the complete normed vector space (in particular, non-empty) as a consequence of the Baire category theorem.
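The genericity behind Lemma 3.3 is easy to see numerically: each kernel is a proper subspace, hence a Lebesgue-null set, so a randomly drawn Gaussian vector avoids all of them with probability one. The sketch below (hypothetical names) uses random sampling instead of the explicit \(\alpha \)-construction of the proof.

```python
import numpy as np

def vector_outside_kernels(mats, rng=np.random.default_rng(0), tries=100, tol=1e-12):
    """Return some v with K @ v != 0 for every non-zero matrix K in mats.
    A Gaussian vector works almost surely; we retry only for numerical safety."""
    n = mats[0].shape[1]
    for _ in range(tries):
        v = rng.standard_normal(n)
        if all(np.linalg.norm(K @ v) > tol for K in mats):
            return v
    raise RuntimeError("no suitable vector found; did you pass a zero matrix?")

K1 = np.array([[1.0, -1.0], [0.0, 0.0]])   # kernel = span{(1, 1)}
K2 = np.array([[1.0,  1.0], [0.0, 0.0]])   # kernel = span{(1, -1)}
v = vector_outside_kernels([K1, K2])
print(v, np.linalg.norm(K1 @ v), np.linalg.norm(K2 @ v))
```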

The next theorem ensures the existence of irreducible vectors with pairwise distinct entries, for any given matrix.

Theorem 3.4

Let A be an \(n\times n\) matrix and let \((\mu _\textbf{x})_{\textbf{x}\in \mathbb {R}^n }\) be the family of measures generated by A. Then,

  1. 1.

    There exists an irreducible vector \(\textbf{v}=(v_i)_i\in \mathbb {R}^n\) for A.

  2. 2.

Moreover, \(\textbf{v}\) can be chosen so that its entries are pairwise distinct, that is, \(v_i\ne v_j\) for \(i\ne j\).

Proof

If \(PA=AP\) for every \(P\in \mathcal {P}\), then every vector is an irreducible vector, and there is nothing to show. Therefore, we consider the case in which there exists at least one \(P_2\in \mathcal {P}\) such that \(P_2A\ne AP_2\). For every \(P\in \mathcal {P}\) such that \(PA\ne AP\), there exists a non-zero vector \(\textbf{v}_{P}\notin \ker (PA-AP)\). Therefore, for every \(P_1\in \mathcal {P}\),

$$\begin{aligned} (P_1)^{\top }\textbf{v}_{P}\notin (P_1)^{\top }\ker (PA-AP)=\ker \bigl ((PA-AP)P_1\bigr ), \end{aligned}$$

where we are using the fact that \(P_1^{\top }=P_1^{-1}\), since this is a permutation matrix. Now, by Lemma 3.3 we can choose \(\textbf{v}\in \mathbb {R}^n\) such that

$$\begin{aligned} \textbf{v}\notin \ker \bigl ((PA-AP)P_1\bigr ) \end{aligned}$$

for all \(P\in \mathcal {P}\) such that \(PA\ne AP\) and \(P_1\in \mathcal {P}\), where we consider all matrices \((PA-AP)P_1\). Therefore, given \(P_1,P_2\in \mathcal {P}\), we have that

$$\begin{aligned} P_1\textbf{v}\in \ker (P_2A-AP_2) \iff P_2A=AP_2. \end{aligned}$$

Now, by Lemma 3.1, we have that \(\mu _{P_1\textbf{v}}=\mu _{\textbf{y}}\) if and only if there exists \(P_2\in \mathcal {P}\) such that \(\textbf{y}=P_2P_1\textbf{v}\) and \(P_1\textbf{v}\in \ker (P_2A-AP_2)\). Hence, by the above observation, \(\mu _{P_1\textbf{v}}=\mu _{\textbf{y}}\) if and only if there exists \(P_2\in \mathcal {P}\) such that \(\textbf{y}=P_2P_1\textbf{v}\) and \(P_2A=AP_2\). This proves the first claim. To prove the second claim, assume that \(v_i=v_j\) for some i, j such that \(i\ne j\). Consider the vector

$$\begin{aligned} \mathbf {v'}:=\textbf{v}+\xi \mathbf {e_i}, \end{aligned}$$

where \(\xi \in \mathbb {R}\setminus \{0\}\) has absolute value small enough that

$$\begin{aligned} \mathbf {v'}\notin \ker ((PA-AP)P_1), \end{aligned}$$

for all \(P,P_1\in \mathcal {P}\) such that \(PA\ne AP\). This is possible since each kernel is closed, and therefore its complement is open. Repeating this argument for every pair of equal entries yields an irreducible vector for A whose entries are pairwise distinct. \(\square \)

From here on in this section, we fix a constant \(K\in \mathbb {R}_{\ge 1}\) and we let \(\mathcal {S}\) be the set of \(n\times n\) matrices A such that \(||A||_{\infty \rightarrow 1}\le K\). Moreover, given a vector \(\textbf{x}\in \mathbb {R}^n\) and \(d>0\), we let

$$\begin{aligned} \textbf{x}^i_{d}:=\textbf{x}+\frac{d^2}{64K}\mathbf {e_i}. \end{aligned}$$

We observe that, if \(\textbf{x}\) satisfies the conditions of Theorem 3.4, then also \(\textbf{x}^i_{d}\) satisfies them, for every \(d>0\) small enough.

Example 3.5

If \(K\ge (n-1)\), then \(\mathcal {S}\) contains all adjacency matrices associated to graphs on n nodes.

We now recall the following result from [1], since it will be needed in the proof of Lemma 3.7 below.

Lemma 3.6

(Lemma 13.1 in [1]) Let X, Y be two jointly distributed \(\mathbb {R}^{k}\)-valued random variables, and let \(\tau (X-Y)\) denote the maximum of \(\mathbb {E}\left( \left| \pi _{i}(X-Y)\right| \right) \) over \(1 \le i \le k\), where \(\pi _{i}: \mathbb {R}^{k} \rightarrow \mathbb {R}\) is the i-th coordinate function for \(1 \le i \le k\). Then,

$$\begin{aligned} d_{\textrm{LP}}(X_{\#}\mathbb {P},Y_{\#}\mathbb {P} ) \le \tau (X-Y)^{1 / 2} k^{3 / 4}. \end{aligned}$$

We apply the above lemma to prove the following

Lemma 3.7

Fix \(A\in \mathcal {S}\), and let \(\textbf{x}\in \mathbb {R}^n\), \(d>0\) and \(i\in [n]\). Let \(\nu _1\in \mathcal {Z}_{\textbf{x}}\) and write \(\nu _1=\mu _{P\textbf{x}}\) for some \(P\in \mathcal {P}\). Then, the measure

$$\begin{aligned} \nu _2:=\mu _{P(\textbf{x}_d^i)}\in \mathcal {Z}_{\textbf{x}^i_d} \end{aligned}$$

is such that

$$\begin{aligned} d_{\textrm{LP}}(\nu _1, \nu _2)<\tfrac{d}{4}. \end{aligned}$$

Proof

We have that

$$\begin{aligned} \left\| P\textbf{x}-P(\textbf{x}_d^i)\right\| _1= \frac{d^2}{64K}\left\| P\mathbf {e_i}\right\| _1\le \frac{d^2}{64K}\le \frac{d^2}{64} \end{aligned}$$

and

$$\begin{aligned} \left\| AP\textbf{x}-AP(\textbf{x}_d^i)\right\| _1\le \left\| A\right\| _{\infty \rightarrow 1}\cdot \left\| P\textbf{x}-P(\textbf{x}_d^i)\right\| _{\infty }\le \left\| A\right\| _{\infty \rightarrow 1}\cdot \frac{d^2}{64K}\le \frac{d^2}{64}. \end{aligned}$$

The claim then follows by applying Lemma 3.6 to the \(\mathbb {R}^2\)-valued random variables \(X(\omega )=(P\textbf{x}_{\omega },AP\textbf{x}_{\omega })\) and \(Y(\omega )=(P(\textbf{x}_d^i)_{\omega },A{P(\textbf{x}_d^i)}_{\omega })\), whose laws \(X_{\#}\mathbb {P}\) and \(Y_{\#}\mathbb {P}\) are \(\mu _{P\textbf{x}}\) and \(\mu _{P(\textbf{x}_d^i)}\), respectively: the two inequalities above give \(\tau (X-Y)\le \frac{d^2}{64}\), hence \(d_{\textrm{LP}}(\nu _1,\nu _2)\le \left( \frac{d^2}{64}\right) ^{1/2} 2^{3/4}=\frac{2^{3/4}}{8}\,d<\frac{d}{4}\). \(\square \)

Next, we prove a theorem that allows us to associate, to any measure in \(\mathcal {Z}_\textbf{v}\), a unique measure in \(\mathcal {Z}_{\textbf{v}^i_{\varepsilon }}\), whenever \(\textbf{v}\) is irreducible and \(\varepsilon \) is small enough.

Theorem 3.8

Fix \(A\in \mathcal {S}\) and let \(\textbf{v}\in \mathbb {R}^n\) be an irreducible vector for A whose entries are pairwise distinct. Let \(\varepsilon >0\) be such that:

  1. 1.

    \(\varepsilon <\min \bigl \{\mathrm {d_{LP}}(\nu _1,\nu _2): \nu _1,\nu _2\in \mathcal {Z}_\textbf{v} \text { and }\nu _1\ne \nu _2\bigr \}\), where the minimum exists since \(\mathcal {Z}_\textbf{v}\) is finite;

  2. 2.

    \(\varepsilon \) is small enough that the vectors \(\textbf{v}^i_{\varepsilon }\), for \(i=1,\ldots ,n\), are such that, if \(P\in \mathcal {P}\) and

    $$\begin{aligned} (P\textbf{v})_1<(P\textbf{v})_2<\cdots <(P\textbf{v})_n, \end{aligned}$$

    then

    $$\begin{aligned} (P\textbf{v}^i_{\varepsilon })_1<(P\textbf{v}^i_{\varepsilon })_2<\cdots <(P\textbf{v}^i_{\varepsilon })_n. \end{aligned}$$

Let also \(\nu _1\in \mathcal {Z}_{\textbf{v}}\) and write \(\nu _1=\mu _{P\textbf{v}}\) for some \(P\in \mathcal {P}\). Then, the measure \(\nu _2:=\mu _{P(\textbf{v}_\varepsilon ^i)}\) is the unique measure \(\nu \in \mathcal {Z}_{\textbf{v}_{\varepsilon }^i}\) such that

$$\begin{aligned} d_{\textrm{LP}}(\nu _1, \nu )<\tfrac{\varepsilon }{4}. \end{aligned}$$

Proof

First, we prove that \(\mathcal {Z}_{\textbf{v}}\) has cardinality greater than or equal to that of \(\mathcal {Z}_{\textbf{v}^i_{\varepsilon }}\). Fix two distinct elements in \(\mathcal {Z}_{\textbf{v}^i_{\varepsilon }}\) and write them as \(\mu _{P_1(\textbf{v}^i_{\varepsilon })}\) and \(\mu _{P_2(\textbf{v}^i_{\varepsilon })}\), for some \(P_1,P_2\in \mathcal {P}\). If we show that \(\mu _{P_1\textbf{v}}\ne \mu _{P_2\textbf{v}}\), we are done. Assume that \(\mu _{P_1\textbf{v}}= \mu _{P_2\textbf{v}}\). Then, since \(\textbf{v}\) is irreducible, there exists \(P_3\in \mathcal {P}\) such that \(P_2\textbf{v}=P_3P_1\textbf{v}\) and \(P_3A=AP_3\). By Corollary 3.2 and by the choice of \(\varepsilon \), this implies that

$$\begin{aligned} \mu _{P_1(\textbf{v}^i_{\varepsilon })}= \mu _{P_3P_1(\textbf{v}^i_{\varepsilon })}= \mu _{P_2(\textbf{v}^i_{\varepsilon })}, \end{aligned}$$

which is a contradiction since we are assuming that \(\mu _{P_1(\textbf{v}^i_{\varepsilon })}\) and \(\mu _{P_2(\textbf{v}^i_{\varepsilon })}\) are distinct. Therefore, \(\mathcal {Z}_{\textbf{v}}\) has cardinality greater than or equal to that of \(\mathcal {Z}_{\textbf{v}^i_{\varepsilon }}\).

Now, we illustrate the rest of the proof in Fig. 1. By Lemma 3.7 we know that \(\nu _1\) and \(\nu _2\) are such that

$$\begin{aligned} \mathrm {d_{LP}}(\nu _1,\nu _2)<\frac{\varepsilon }{4}. \end{aligned}$$

Fix any \(\nu '_1\in \mathcal {Z}_{\textbf{v}}\) such that \(\nu '_1\ne \nu _1\), and consider the corresponding measure \(\nu '_2\in \mathcal {Z}_{\textbf{v}^i_{\varepsilon }}\) as in Lemma 3.7, so that

$$\begin{aligned} \mathrm {d_{LP}}\left( \nu '_1,\nu '_2\right) <\frac{\varepsilon }{4}. \end{aligned}$$

By the choice of \(\varepsilon \), we have that \(\mathrm {d_{LP}}(\nu _1,\nu '_1)>\varepsilon \) and now, applying the reverse triangle inequality, it is easy to see that \(\mathrm {d_{LP}}(\nu _1,\nu '_2)>\frac{\varepsilon }{4}\) and \(\mathrm {d_{LP}}(\nu '_1,\nu _2)>\frac{\varepsilon }{4}\). \(\square \)

Fig. 1: An illustration of the proof of Theorem 3.8

4 Main result

We are now ready to prove Theorem 1.2.

Proof of Theorem 1.2

If we know the set \(\mathcal {Z}_A\) of measures generated by A, then for any \(\textbf{x}\in \mathbb {R}^n\) we also know the set \(\mathcal {Z}_{(A,\,\textbf{x})}\). Hence, we can choose a vector

$$\begin{aligned} \textbf{v}\in \textrm{argmax}_{\textbf{x}\in \mathbb {R}^n}\#\{\mathcal {Z}_{(A,\textbf{x})}\}=\textrm{argmax}_{\textbf{x}\in \mathbb {R}^n}\#\{\mu _{(A,\,P\textbf{x})}: \, P\in \mathcal {P}\}. \end{aligned}$$

It is easy to see that \(\textbf{v}\) must be an irreducible vector for A and, up to a small perturbation, we can choose \(\textbf{v}\) such that all its entries are pairwise distinct. Now, fix \(\varepsilon >0\) as in Theorem 3.8, and choose it also small enough that:

  1. 1.

    For \(i\in \{1,\ldots ,n\}\),

    $$\begin{aligned} \textbf{v}^i_{\varepsilon }\in \textrm{argmax}_{\textbf{x}\in \mathbb {R}^n}\#\{\mathcal {Z}_{(A,\textbf{x})}\}, \end{aligned}$$

    so that also these vectors are irreducible for A;

  2. 2.

    The entries of each \(\textbf{v}^i_{\varepsilon }\) are pairwise distinct.

Fix now \(\nu _1\in \mathcal {Z}_{(A,\,\textbf{v})}\). Then, we know that \(\nu _1=\mu _{(A,\,P_1\textbf{v})}\), for some \(P_1\in \mathcal {P}\), but we don’t know A and \(P_1\). However, we know that there exists \(P_2\in \mathcal {P}\) such that the unique ordered support of \(\nu _1\) is (cf. Example 4.1 below)

$$\begin{aligned} (P_2P_1\textbf{v},\,P_2AP_1\textbf{v}), \end{aligned}$$
(4.1)

and we do know the resulting pair in (4.1). In particular, since we know both \(\textbf{v}\) and \(P_2P_1\textbf{v}\), we can also reconstruct \(P_2P_1\).

Now, fix \(i\in \{1,\ldots ,n\}\) and observe that, since \(P_1,P_2\in \mathcal {P}\), there exists \(j\in \{1,\ldots ,n\}\) such that \(\mathbf {e_j}=P^\top _1P_2^\top \mathbf {e_i}\). By applying Theorem 3.8 to \(\nu _1\), we know that the measure

$$\begin{aligned} \nu _2:=\mu _{\left( A,\,P_1\left( \textbf{v}^j_{\varepsilon }\right) \right) }\in \mathcal {Z}_{\left( A,\,\textbf{v}^j_{\varepsilon }\right) } \end{aligned}$$

is the unique measure \(\nu \in \mathcal {Z}_{(A\,,\textbf{v}^j_{\varepsilon })}\) such that

$$\begin{aligned} d_{\textrm{LP}}(\nu _1, \nu )<\tfrac{\varepsilon }{4}. \end{aligned}$$

Hence, we can identify \(\nu _2\) and therefore also its unique ordered support, that we can write as

$$\begin{aligned}&\left( P_2P_1\left( \textbf{v}^j_{\varepsilon }\right) ,\,P_2 AP_1\left( \textbf{v}^j_{\varepsilon }\right) \right) \nonumber \\&\quad =\left( P_2P_1\left( \textbf{v}+\frac{\varepsilon ^2}{64K} \mathbf {e_j}\right) ,\,P_2 AP_1\left( \textbf{v}+\frac{\varepsilon ^2}{64K} \mathbf {e_j}\right) \right) \nonumber \\&\quad =\left( P_2P_1\textbf{v}+\frac{\varepsilon ^2}{64K} \mathbf {e_i},\,P_2AP_1\textbf{v}+\frac{\varepsilon ^2}{64K} P_2AP^\top _2\mathbf {e_i}\right) . \end{aligned}$$
(4.2)

Taking the difference between (4.1) and (4.2) leads to

$$\begin{aligned} \frac{\varepsilon ^2}{64K}\left( \mathbf {e_i},P_2AP^\top _2\mathbf {e_i}\right) , \end{aligned}$$
(4.3)

therefore we are able to reconstruct the \(i-\)th column of \(P_2 AP_2^\top \). Since we can do this for every \(i\in \{1,\ldots ,n\}\), we can reconstruct the entire matrix \(P_2AP^\top _2\). This proves the claim. \(\square \)
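The reconstruction argument above can be turned into a small numerical experiment. The following Python sketch (hypothetical names, uniform weights, \(P_1={{\,\textrm{Id}\,}}\) and a generic random \(\textbf{v}\)) queries the matrix only through ordered supports and recovers \(P_2AP_2^\top \) column by column from the differences (4.3); it is a simplified instance of the proof, not a general-purpose algorithm.

```python
import numpy as np

def ordered_support(A, x):
    """Ordered support of mu_(A,x) with uniform weights: the atoms
    (x_i, (Ax)_i) sorted by their first coordinate (assumed pairwise distinct)."""
    Ax = np.asarray(A) @ x
    order = np.argsort(x)
    return x[order], Ax[order]

def reconstruct_up_to_permutation(A, rng=np.random.default_rng(1), c=1e-6):
    """Recover P2 A P2^T for some permutation matrix P2, querying A only
    through ordered supports, mimicking the proof of Theorem 1.2."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    v = rng.standard_normal(n)          # generic: pairwise distinct entries a.s.
    xs, ys = ordered_support(A, v)      # xs = P2 v, ys = P2 A v
    sigma = np.argsort(v)               # sorted position i comes from index sigma[i]
    M = np.zeros((n, n))
    for i in range(n):
        vp = v.copy()
        vp[sigma[i]] += c               # the perturbation lands in sorted position i
        _, ysp = ordered_support(A, vp)
        M[:, i] = (ysp - ys) / c        # i-th column of P2 A P2^T, cf. (4.3)
    return np.round(M, 6)

A = np.array([[0.0, 1.0, 1.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
print(reconstruct_up_to_permutation(A))   # a matrix of the form P2 A P2^T
```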

We illustrate the first part of the proof of Theorem 1.2 with an example.

Example 4.1

Let \(n=3\), so that \(|\mathcal {P}|=6\), and let

$$\begin{aligned} A:=\begin{pmatrix} 0 &{} 1 &{} 1\\ 0 &{} 1 &{} 0 \\ 1 &{} 1 &{} 0 \end{pmatrix}, \quad P':=\begin{pmatrix} 0 &{} 0 &{} 1\\ 0 &{} 1 &{} 0 \\ 1 &{} 0 &{} 0 \end{pmatrix}\in \mathcal {P}. \end{aligned}$$

Then, \(P'A=AP'\), implying that

$$\begin{aligned} \max _{\textbf{x}\in \mathbb {R}^n}\#\{\mathcal {Z}_{(A,\textbf{x})}\}=\max _{\textbf{x}\in \mathbb {R}^n}\#\{\mu _{(A,\,P\textbf{x})}: \, P\in \mathcal {P}\}\le 3. \end{aligned}$$

Now, by letting

$$\begin{aligned} \textbf{v}:=\begin{pmatrix}v_1 \\ v_2 \\ v_3\end{pmatrix}:=\begin{pmatrix}3 \\ 1 \\ 2\end{pmatrix}, \end{aligned}$$

it is easy to see that

$$\begin{aligned} \#\{\mathcal {Z}_{(A,\textbf{v})}\}=\#\{\mu _{(A,\,P\textbf{v})}: \, P\in \mathcal {P}\}= 3, \end{aligned}$$

therefore

$$\begin{aligned} \textbf{v}\in \textrm{argmax}_{\textbf{x}\in \mathbb {R}^n}\#\{\mathcal {Z}_{(A,\textbf{x})}\}=\textrm{argmax}_{\textbf{x}\in \mathbb {R}^n}\#\{\mu _{(A,\,P\textbf{x})}: \, P\in \mathcal {P}\}, \end{aligned}$$

hence, in particular, \(\textbf{v}\) is irreducible for A. The support of \(\mu _{(A,\,\textbf{v})}\) is illustrated in Fig. 2a. Now, let

$$\begin{aligned} P_1:=\begin{pmatrix} 0 &{} 1 &{} 0\\ 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 \end{pmatrix}\in \mathcal {P}. \end{aligned}$$
Fig. 2: The supports of \(\mu _{(A,\,\textbf{v})}\) and \(\mu _{(A,\,P_1\textbf{v})}\) in Example 4.1

Figure 2b shows the support of \(\nu _1:=\mu _{(A,\,P_1\textbf{v})}\), and it is clear from the picture that the unique ordered support of \(\nu _1\) is

$$\begin{aligned} \left( \begin{pmatrix}1 \\ 2 \\ 3\end{pmatrix},\,\begin{pmatrix}5 \\ 4 \\ 3\end{pmatrix} \right) =(P_2P_1\textbf{v},\,P_2AP_1\textbf{v}), \end{aligned}$$

where

$$\begin{aligned} P_2:=\begin{pmatrix} 1 &{} 0 &{} 0\\ 0 &{} 0 &{} 1 \\ 0 &{} 1 &{} 0 \end{pmatrix}\in \mathcal {P}. \end{aligned}$$

We note that, if we consider measures \(\mu _{\textbf{x}}\) which are relative to a probability vector \(\textbf{p}\) that is not the uniform one, then a matrix A as in Theorem 1.2 will be characterized by a stronger notion than switching equivalence. In particular, in this case \(\mathcal {Z}_A=\mathcal {Z}_B\) if and only if \(A=PBP^{\top }\) for some permutation matrix P such that \(P\textbf{p}=\textbf{p}\). An interesting particular case is when the vector \(\textbf{p}\) has pairwise distinct entries, \(p_i\ne p_j\) for \(i\ne j\). In this case, \(\mathcal {Z}_A=\mathcal {Z}_B\) if and only if \(A=B\).

5 Properties of matrices and graphs from the generated measure

We now discuss how some properties of matrices and graphs directly translate in terms of the measure generated by the matrices.

We first notice that, for an \(n\times n\) matrix A and a scalar \(\lambda \in \mathbb {R}\), a measure \(\mu _{(A,\,\textbf{x})}\in \mathcal {Z}_A\) with \(\textbf{x}\ne \textbf{0}\) is supported on the graph of the linear function \(x\mapsto \lambda x\),

$$\begin{aligned} \hbox {graph}\,(\lambda x):=\{(x,\lambda x)\in \mathbb {R}^2: \ x\in \mathbb {R}\}, \end{aligned}$$

if and only if \(\lambda \) is an eigenvalue of the matrix A. Moreover, we notice that the measure \(\mu \in \mathcal {Z}_A\) which is completely supported on the vertical line

$$\begin{aligned} \{(1,x)\in \mathbb {R}^2: \ x\in \mathbb {R}\} \end{aligned}$$

is the measure generated by the constant one vector. For this reason, the set

$$\begin{aligned} \{x\in \mathbb {R}: \ (1, x) \in \textrm{supp}(\mu )\} \end{aligned}$$

corresponds to the set of row sums of the matrix A.
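In particular, if A is the adjacency matrix of a graph, the atoms of the measure generated by the all-ones vector are exactly the pairs \((1,\deg (v_i))\). A minimal Python sketch (hypothetical names) reading the degree multiset off this measure:

```python
import numpy as np

def degrees_from_measure(A):
    """Degrees of a graph, read from the measure generated by the all-ones
    vector: its atoms are (1, i-th row sum) = (1, deg(v_i))."""
    ones = np.ones(A.shape[0])
    atoms = list(zip(ones, np.asarray(A, dtype=float) @ ones))
    return sorted(y for x, y in atoms if x == 1.0)

A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])   # a path on 3 vertices
print(degrees_from_measure(A))                     # [1.0, 1.0, 2.0]
```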

We now want to relate our measure representation of matrices to graphs. In order to do this, we introduce several matrix representations of graphs. We start with the simplest possible choice.

Definition 5.1

Let \(G=(V,E)\) be a graph on vertices \(v_{1}, \ldots , v_{N}\). The adjacency matrix of G is the \(N\times N\) matrix A, whose entries are

$$\begin{aligned} A_{i j}:= {\left\{ \begin{array}{ll} 1 &{} \text {if}\quad \left( v_{i}, v_{j}\right) \in E \\ 0 &{} \text {otherwise}.\end{array}\right. } \end{aligned}$$

Another possible matrix representation that can be advantageous in some cases is the following.

Definition 5.2

The Kirchhoff Laplacian of G is the \(N\times N\) matrix

$$\begin{aligned} K:=D-A, \end{aligned}$$

where D is the diagonal matrix of the degrees.

We remark that both the adjacency matrix and the Kirchhoff Laplacian matrix are such that two graphs are isomorphic if and only if the respective matrices \(M_1\) and \(M_2\) are related by \(M_1=PM_2P^{\top }\) for some permutation matrix P.

Additionally, we recall the following

Definition 5.3

The normalized Laplacian is the \(N\times N\) matrix

$$\begin{aligned} L:=\textrm{Id}-D^{-1 } A, \end{aligned}$$

where \(\textrm{Id}\) is the \(N \times N\) identity matrix. For \(i\ne j\), we have

$$\begin{aligned} L_{i j}=-\frac{A_{i j}}{{\text {deg}} v_{i}}, \end{aligned}$$

which is minus the probability that a random walker at \(v_{i}\) moves to \(v_{j}\) in one step.

We refer to [10, 18] for more details on the normalized Laplacian and its spectral theory.
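The three matrix representations above are straightforward to build; the following Python sketch (hypothetical names, assuming a simple graph without isolated vertices so that D is invertible) constructs them from an edge list.

```python
import numpy as np

def graph_matrices(edges, n):
    """Adjacency matrix A, Kirchhoff Laplacian K = D - A and normalized
    Laplacian L = Id - D^{-1} A of a simple graph on n vertices."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    D = np.diag(A.sum(axis=1))
    K = D - A
    L = np.eye(n) - np.linalg.inv(D) @ A
    return A, K, L

A, K, L = graph_matrices([(0, 1), (1, 2), (2, 0), (2, 3)], n=4)
# L is similar to a symmetric matrix, so its spectrum is real.
print(np.round(np.sort(np.linalg.eigvals(L).real), 3))
```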

We now consider the above matrices in relation to graphs, and we observe how properties of graphs translate in the family of measure representations.

We consider a graph \(G=(V,E)\) and the related adjacency matrix A. We already discussed how to directly extract the spectral information of a matrix. Additionally, as the degrees of the graph G correspond to the row sums of the adjacency matrix, we can directly determine the degrees from \(\mathcal {Z}_A\), as presented above.

The spectrum and the degree distribution determine many graph properties. For example, the spectrum of the normalized Laplacian determines the number of connected components, and whether the graph is bipartite. Additional properties are characterized by this information. We now recall the following definition, and we refer to [4] for more details on this topic.

Definition 5.4

For two graphs \(F=(V(F),E(F))\) and \(G=(V(G),E(G))\), a graph homomorphism from F to G is a function \(\phi : V(F) \rightarrow V(G)\) such that, for all \(v,w\in V(F)\), \(\{v,w\}\in E(F)\) implies \(\{\phi (v),\phi (w)\}\in E(G)\). We denote by \({\text {hom}}(F,G)\) the number of homomorphisms from F to G.

We now give some examples of homomorphism numbers that can be extracted directly from the measure-theoretic representation.

Example 5.5

Let \(S_{k}\) denote the star graph with k nodes. Then, for any graph G on n nodes,

$$\begin{aligned} {\text {hom}}\left( S_{k}, G\right) =\sum _{i=1}^{n} d_{i}^{k-1}, \end{aligned}$$

where \(d_{1}, \ldots , d_{n}\) are the degrees of G.

Example 5.6

Let \(C_{k}\) denote the cycle graph on k nodes, and again let G be any graph on n nodes. Then

$$\begin{aligned} {\text {hom}}\left( C_{k}, G\right) =\sum _{i=1}^{n} \lambda _{i}^{k} \end{aligned}$$

where \(\lambda _{1}, \ldots , \lambda _{n}\) are the eigenvalues of the adjacency matrix of G.
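Both formulas are easy to verify computationally. The Python sketch below (hypothetical names) evaluates the degree and spectral expressions of Examples 5.5 and 5.6 and compares them with a brute-force homomorphism count.

```python
import numpy as np
from itertools import product

def hom_star(degrees, k):
    """hom(S_k, G) = sum_i d_i^(k-1), as in Example 5.5."""
    return sum(d ** (k - 1) for d in degrees)

def hom_cycle(A, k):
    """hom(C_k, G) = sum_i lambda_i^k = trace(A^k), as in Example 5.6."""
    return float(np.trace(np.linalg.matrix_power(np.asarray(A, dtype=float), k)))

def hom_brute_force(F_edges, nF, A):
    """Count homomorphisms from F (edge list on vertices 0..nF-1) to the graph of A."""
    n = A.shape[0]
    return sum(all(A[phi[u], phi[v]] for u, v in F_edges)
               for phi in product(range(n), repeat=nF))

A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])        # the triangle K_3
star3 = [(0, 1), (0, 2)]                               # S_3: centre 0, two leaves
print(hom_star(A.sum(axis=1), 3), hom_brute_force(star3, 3, A))   # 12 12
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]              # C_4
print(hom_cycle(A, 4), hom_brute_force(cycle4, 4, A))             # 18.0 18
```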

Notice that, in general, it is not easy to reconstruct directly general information about the adjacency matrix A and its powers, because the measure representation does not keep track of permutations of vectors.

6 Relationship with action convergence metric

In this section we briefly present the notion of action convergence from [1], and we underline the relationship with the metric defined in this work. We start by recalling the following

Definition 6.1

A P-operator is a linear operator of the form \(A: L^{\infty }(\Omega ) \rightarrow L^{1}(\Omega )\) such that

$$\begin{aligned} \Vert A\Vert _{\infty \rightarrow 1}:=\sup _{X \in L^{\infty }(\Omega ),\; X\ne 0}\frac{\Vert AX\Vert _{1}}{\Vert X\Vert _{\infty }} \end{aligned}$$

is finite, where \((\Omega ,\mathcal {F}, \mathbb {P})\) is a generic probability space. We denote by \(\mathcal {B}(\Omega )\) the set of all P-operators on \(\Omega \).

We show, with the following example, that a matrix is a \(P-\)operator.

Example 6.2

If \(\Omega =[n]:=\{1,\ldots ,n\}\) and \(\mathbb {P}=(p_{\omega })_{\omega \in [n]}\) on \(\Omega \) is the probability measure relative to the probability vector \(\textbf{p}=(p_{\omega })_{\omega \in [n]}\), then \(L^{1}(\Omega )=L^{\infty }(\Omega )=\mathbb {R}^{n}\). In this case, \(\mathcal {B}(\Omega )\) is the set of all \(n \times n\) matrices. Thus, every matrix \(A \in \mathbb {R}^{n \times n}\) is a P-operator.

We consider now a \(P-\)operator

$$\begin{aligned} A: L^{\infty }(\Omega )\rightarrow L^1(\Omega ) \end{aligned}$$

and we let

$$\begin{aligned} X\in L^{\infty }(\Omega ). \end{aligned}$$

This is a bounded random variable, and we also have that

$$\begin{aligned} AX\in L^1(\Omega ) \end{aligned}$$

is a random variable with finite expectation. We can therefore define the \(2-\)dimensional random vector \((X,AX)\). More generally, for all random variables \(Z_1,Z_2,\ldots ,Z_k\in L^{\infty }(\Omega )\), we can define the \(2k-\)dimensional random vector

$$\begin{aligned} (Z_1,AZ_1,Z_2,AZ_2,\ldots ,Z_k,AZ_k). \end{aligned}$$

We show, with the following example, how the \(2-\)dimensional random vectors constructed above relate to the measures generated by A defined in Sect. 2.

Example 6.3

For the probability space \(\Omega =[n]\) with probability measure \(\mathbb {P}=(p_{\omega })_{\omega \in [n]}\), a matrix \(A\in \mathbb {R}^{n\times n}\) and a vector \(X=(X(\omega ))_{\omega \in [n]}\in \mathbb {R}^n\), the law of the \(2-\)dimensional random vector \((X,AX)\) is

$$\begin{aligned} \sum _{\omega \in [n]}p_{\omega } \delta _{\left( X(\omega ),(AX)(\omega )\right) }. \end{aligned}$$

These, in fact, are the measures generated by A according to Definition 1.1.

We now define the \(k-\)profile of A as

$$\begin{aligned} \mathcal {S}_k(A)=\bigcup _{Z_1,\ldots ,Z_k\in L_{[-1,1]}^{\infty }(\Omega )}\{(Z_1,AZ_1,Z_2,AZ_2,\ldots ,Z_k,AZ_k)_{\#}\mathbb {P}\}. \end{aligned}$$
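The full \(k-\)profile ranges over all \([-1,1]\)-valued test functions, so it cannot be enumerated; it can, however, be explored by sampling. The sketch below (hypothetical names, \(\Omega =[n]\) with the uniform measure) draws a few random test tuples and returns, for each of them, the atoms of the corresponding law in \(\mathbb {R}^{2k}\).

```python
import numpy as np

def sample_k_profile(A, k, num_test=5, rng=np.random.default_rng(2)):
    """Monte Carlo exploration of the k-profile S_k(A) for Omega = [n] with the
    uniform measure: each sampled test tuple (Z_1, ..., Z_k) with values in
    [-1, 1] yields one law on R^{2k}, returned as its n atoms (weight 1/n each)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    profile = []
    for _ in range(num_test):
        Z = rng.uniform(-1.0, 1.0, size=(k, n))        # k test vectors
        AZ = (A @ Z.T).T                               # their images
        atoms = np.column_stack([np.column_stack([Z[j], AZ[j]]) for j in range(k)])
        profile.append(atoms)        # rows: (Z_1, AZ_1, ..., Z_k, AZ_k)(omega)
    return profile

A = np.array([[0.0, 1.0], [1.0, 0.0]])
for atoms in sample_k_profile(A, k=1, num_test=2):
    print(atoms)     # for k = 1 these are sampled elements of the 1-profile
```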

Considering two \(P-\)operators

$$\begin{aligned} A: L^{\infty }(\Omega _1)\rightarrow L^1(\Omega _1) \end{aligned}$$

and

$$\begin{aligned} B: L^{\infty }(\Omega _2)\rightarrow L^1(\Omega _2), \end{aligned}$$

we finally define the action convergence metric.

Definition 6.4

(Metrization of action convergence) For two P-operators A and B, the action convergence distance is

$$\begin{aligned} d_{M}(A, B):=\sum _{k=1}^{\infty } 2^{-k} d_{H}\left( \mathcal {S}_{k}(A), \mathcal {S}_{k}(B)\right) \end{aligned}$$

This metric has some nice compactness properties as stated in the following

Theorem 6.5

(Theorem 2.9 in [1]) Let \(p \in [1, \infty )\) and \(q \in [1, \infty ]\). Let \(\left\{ A_{i}\right\} _{i=1}^{\infty }\) be a convergent sequence of P-operators with uniformly bounded \(\Vert \cdot \Vert _{p \rightarrow q}\) norms. Then there is a P-operator A such that \(\lim _{i \rightarrow \infty } d_{M}\left( A_{i}, A\right) =0\), and \(\Vert A\Vert _{p \rightarrow q} \le \sup _{i \in \mathbb {N}}\left\| A_{i}\right\| _{p \rightarrow q}\).

Moreover, action convergence unifies several approaches to graph limit theory. In particular, consider a sequence of adjacency matrices \(A_n\) of graphs \(G_n\), and let \(v_n\) be the number of vertices of \(G_n\). Then,

  • The action convergence of the sequence

    $$\begin{aligned} \frac{A_n}{v_n} \end{aligned}$$

    coincides with graphon convergence [1, Theorem 8.2 and Lemma 8.3]

  • The action convergence of the sequence

    $$\begin{aligned} A_n \end{aligned}$$

    coincides with local–global convergence [1, Theorem 9.2].

The following special class of \(P-\)operators is important in graph limit theory, as it naturally generalizes the notion of the adjacency matrix of a graph.

Definition 6.6

A positivity-preserving and self-adjoint P-operator is called a graphop.

Consider two \(P-\)operators

$$\begin{aligned} A: L^{\infty }(\Omega _1)\rightarrow L^1(\Omega _1) \end{aligned}$$

and

$$\begin{aligned} B: L^{\infty }(\Omega _2)\rightarrow L^1(\Omega _2). \end{aligned}$$

We can now define the simplified metric

Definition 6.7

(1-profile metric) For two P-operators A and B, the 1-profile distance is

$$\begin{aligned} d_{S}(A, B):= d_{H}\left( \mathcal {S}_{1}(A), \mathcal {S}_{1}(B)\right) \end{aligned}$$

We now compare the \(1-\)profile metric introduced in this work with the action convergence metric. We have the following

Lemma 6.8

$$\begin{aligned} d_{S}(A, B)\le 2d_{M}(A, B) \end{aligned}$$

Proof

$$\begin{aligned} \begin{aligned} d_{S}(A, B)&=d_{H}\left( \mathcal {S}_{1}(A), \mathcal {S}_{1}(B)\right) \\&\le d_{H}\left( \mathcal {S}_{1}(A), \mathcal {S}_{1}(B)\right) +2\sum _{k=2}^{\infty } 2^{-k} d_{H}\left( \mathcal {S}_{k}(A), \mathcal {S}_{k}(B)\right) \\&\le 2d_{M}(A, B). \end{aligned} \end{aligned}$$

\(\square \)

Moreover, a direct corollary of Theorem 1.2 is the following

Corollary 6.9

Let \(K\in \mathbb {R}_{\ge 1}\). Let \(\mathcal {S}\) be the set of \(n\times n\) matrices A such that \(||A||_{\infty \rightarrow 1}\le K\). Then, for matrices \(A, B\in \mathcal {S}\) we get that

$$\begin{aligned} d_{M}(A, B)=0 \end{aligned}$$

if and only if

$$\begin{aligned} d_{S}(A, B)=0 \end{aligned}$$

Notice that, on the set of isomorphism classes of graphs with at most n vertices, the \(1-\)profile metric and the action convergence metric both induce the discrete topology and, therefore, we have the following

Corollary 6.10

The \(1-\)profile metric and the action convergence metric are topologically equivalent on the set of isomorphism classes of graphs with at most n vertices.

This is no longer clear for general \(P-\)operators, as the topologies induced by the two metrics are not discrete anymore.

7 Future directions

In future work, we aim to better understand when the action convergence metric is topologically equivalent to the simplified 1-profile metric that we introduced. This would contribute to a better understanding of action convergence, potentially giving new insight into the difference between the convergence of dense graph sequences and that of sparse/bounded degree sequences. The 1-profile metric could be related to weaker notions of local–global convergence for bounded degree graph sequences, such as the notion of Benjamini–Schramm convergence. A better understanding of these types of metrics could also help to understand whether limiting the number of colorings in the notion of local–global convergence to 2 or more actually changes the notion of convergence, as asked in [12].