Multiscale transforms for signals on simplicial complexes

Our previous multiscale graph basis dictionaries/graph signal transforms—Generalized Haar-Walsh Transform (GHWT); Hierarchical Graph Laplacian Eigen Transform (HGLET); Natural Graph Wavelet Packets (NGWPs); and their relatives—were developed for analyzing data recorded on vertices of a given graph. In this article, we propose their generalization for analyzing data recorded on edges, faces (i.e., triangles), or more generally κ\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\kappa $$\end{document}-dimensional simplices of a simplicial complex (e.g., a triangle mesh of a manifold). The key idea is to use the Hodge Laplacians and their variants for hierarchical partitioning of a set of κ\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\kappa $$\end{document}-dimensional simplices in a given simplicial complex, and then build localized basis functions on these partitioned subsets. We demonstrate their usefulness for data representation on both illustrative synthetic examples and real-world simplicial complexes generated from a co-authorship/citation dataset and an ocean current/flow dataset.

toolbox of applied mathematics.This rise in popularity is largely due to the adaptation of discrete differential geometry [11] in applications in computer vision [32,40], statistics [28], topological data analysis [6,47], and network analysis [46].
One of the key challenges to applying wavelets and similar constructions to vertexbased graph signals is that graphs lack a natural translation operator, which prevents the construction of convolutional operators and traditional Littlewood-Paley theory [22,29,48].This challenge is also present for general κ-dimensional simplices.One method for overcoming this difficulty is to perform convolution solely in the "frequency" domain and define wavelet-like bases entirely in the coefficient space of the Laplacian (or in this case Hodge Laplacian) transform.Following this line of research, there have been several approaches to defining wavelets [39] and convolutional neural networks [14] in which the input signal is transformed in a series of coefficients in the eigenspace of the Hodge Laplacian.Unfortunately, the atoms (or basis vectors) generated by these methods are not always locally supported, and it can be difficult to interpret their role in analyzing a given graph signal.
An alternative path to the creation of wavelet-like dictionaries and transforms is to first develop a hierarchical block decomposition of the domain and then use this to develop multiscale transforms [21,20,44].These techniques rely on recursively computing bipartitions of the domain and then generating localized bases on the subsets of the domain.In this work, we propose a simplex analog to the Fiedler vector [19] to solve a relaxed version of the simplex-normalized-cut problem, which we can apply iteratively to develop a hierarchical bipartition of the κ-dimensional simplices in a simplicial complex.From here, we are able to apply the general scheme of [21] and [20] to develop the Hierarchical Graph Laplacian Eigen Transform and the Generalized Haar-Walsh Transform, respectively, for a given collection of simplices of an arbitrarily high order.As a result, we can also generate orthonormal Haar bases, orthonormal Walsh bases, as well as data-adaptive orthonormal bases using the best-basis selection method [10].
The main challenge in lifting these transforms to the simplicial setting lies in the simplex orientations, which cause the resulting Laplacians generally to contain mixed positive and negative off-diagonal elements.We are no longer guaranteed a non-negative Perron vector [1] to use as a DC component, and so must incorporate the orientation information both in order to develop a Fiedler vector appropriate for partitioning the κ-dimensional simplices of a complex, and for interpreting the nature of the resulting partition.Further challenges lie in there being multiple ways to define adjacency between simplices, multiple ways to generalize simplex weights of pairs of adjacent simplices and multiple ways to balance the "upper" and "lower" parts of the Hodge Laplacian.

Outline. This article is organized as follows:
In Section 2 we formally describe simplicial complexes and how their geometry leads to notions of adjacency and orientation.This allows us to define discrete differential operators acting on signals defined on the complex, which in turn are constructed from boundary operators that map between the κ and κ ± 1 degree faces of the complex.In Section 3 we use these boundary operators to describe the Hodge Laplacian and discuss several different variants, some analogous to different normalizations of the graph Laplacian and some more novel.In Section 4 we show how the eigenvectors of the Hodge Laplacian can be use to solve relaxed-cut-like problems to partition a complex.We also develop hierarchical bipartitions, which decompose a given complex roughly in half at each level until we are left with a division into individual elements.In Section 5 we use these bipartitions to develop orthonormal Haar bases.In Section 6, we create overcomplete dictionaries based on given bipartitions and, as a consequence, are also able to define a canonical orthonormal Walsh basis.At the end Fig.1: In this small 2-complex C , e 1 ∼ e 4 because they share the face v 2 , and e 1 ∼ e 2 because they share the face v 1 .Further e 1 ≃ e 2 because their hull t 1 ∈ C , but e 1 e 4 , so that e 1 ∼ 1 e 4 .We have t 1 ∼ t 2 because they share the face e 3 , but as hl(t 1 , t 2 ) ∉ C , we have t 1 ∼ 2 t 2 .
of this section we state two theorems which bound the decay rate of the dictionary coefficients and approximation power of our dictionaries.In Section 7, we present numerical experiments on both illustrative synthetic examples and real-world problems in signal approximation, clustering, and supervised classification.Finally, we conclude this article with Section 8 discussing potential future work.
We have implemented our multiscale simplicial signal transforms in Julia and Python, and code which builds the corresponding basis dictionaries, and was used to generate the figures in this article, is available at: https://github.com/UCD4IDS/MultiscaleSimplexSignalTransforms.jl.

Simplicial Complexes.
In this section we review concepts from algebraic topology to formally define simplicial complexes and introduce some notions of how two simplices can be "adjacent."For a more thorough review, see [5,17].Given a vertex set V = {v 1 , . . ., v n }, a κ-simplex σ is a (κ + 1)-subset of V .A face of σ is a κ-subset of σ, and so σ has κ + 1 faces.A co-face of σ is a (κ + 1)-simplex, of which σ is a face.
Suppose σ = {v i 1 , . . ., v i κ+1 }, i 1 < • • • < i κ+1 , and α ⊂ σ is its face.Then, σ\α consists of a single vertex; let v i ℓ * be that vertex where 1 ≤ ℓ * ≤ κ + 1.Then the natural parity of σ with respect to its face α is defined as nat(σ, α) := (−1) ℓ * +1 .When α is not a face of σ, nat(σ, α) = 0.The natural parity of κ-simplices with respect to their faces generalizes the idea of a directed edge having a head vertex and a tail vertex, and is "natural" because it disallows situations analogous to a directed edge with two heads or two tails.
A simplicial complex C is a collection of simplices closed under subsets, where if We also refer to C as a κ-complex to note that κ max (C ) = κ.Let a κ-region of C refer to any non-empty subset of C κ .

Oriented Simplicial Complexes and Boundary
Operators.An oriented simplex σ further has an orientation p σ ∈ {±1}, which indicates whether its parity with its faces is the same as, or opposite to, its natural parity.When p σ = +1, we say σ is in natural orientation.For example, a directed edge e = (v i , v j ) for i < j is in natural orientation, while if i > j , p e = −1.An oriented simplicial complex contains at most one orientation for any given simplex.
Let X κ be the space of real-valued functions on C κ for each κ ∈ {0, 1, . . ., κ max (C )}.In the case of graphs, X 0 consists of functions taking values on vertices, or graph signals.X 1 consists of functions on edges, or edge flows.A function in X 1 is positive when the corresponding flow direction agrees with the edge orientation, and negative when the flow disagrees.X 2 consists of functions on oriented triangles.
Given an oriented simplicial complex C , for each κ ∈ {0, 1, . . ., κ max }, the boundary operator is a linear operator B κ : X κ+1 → X κ , where for σ ∈ C κ+1 , α ∈ C κ , the corresponding matrix entries are [B κ ] ασ = p σ p α nat(σ, α).Likewise, the coboundary operator for each κ ∈ {0, 1, . . ., κ max } is just B T κ : X κ → X κ+1 , the adjoint to B κ .The expression of the entries of B κ as the relative orientation between simplex and face suggests that these are a natural way to construct functions taking local signed averages, according to adjacency in the simplicial complex.

Data on Simplicial
Complexes.Signal processing on simplicial complexes arises as a natural problem in the setting where richer structure is incorporated in data, than just scalar functions and pairwise relationships.In this article, we assume the input data is given on an existing simplicial complex.
A simple directed graph G = (V, E ) can always be represented as an oriented 1-complex G, with each directed edge e = (v i , v j ) inserted as a 1-simplex having orientation p e = sign( j − i ).With this convention, natural orientation corresponds to the agreement of the edge direction with the global ordering of the vertices.
3. Hodge Laplacian.The boundary operators just introduced represent discrete differential operators encoding the structure of κ-regions in a simplicial complex, and so can be building blocks towards a spectral analysis of functions on those regions.For analyzing functions on κ-simplices with κ > 0, we will construct operators based on the Hodge Laplacian, or κ-Laplacian.As in [32], the combinatorial κ-Laplacian is defined for κ-simplices as We refer to κ as the lower and upper κ-Laplacians, respectively.

Simplex consistency.
Let C be an oriented simplicial complex, and σ ∼ τ ∈ C κ , with α = bd(σ, τ).Then we may write L κ as diag(L κ ) − S κ , where for κ > 0, S κ is the signed adjacency matrix When S κ > 0, we say σ, τ are consistent, and otherwise they are inconsistent.A consistent pair of simplices view their shared boundary face in opposite ways; one as a head face, and the other as a tail face.An inconsistent pair of simplices view their shared boundary face identically.In the case of κ = 1, two directed edges are consistent when they flow into each other at their boundary vertex, and are inconsistent when they collide at the boundary vertex, either both pointing toward it, or both pointing away.Cases for κ = 1, 2 are demonstrated in Figure 2. The combinatorial κ-Laplacian represents signed adjacency between κ-adjacent simplices via their consistency.In particular, this means that L κ depends only on the orientations of simplices in C κ .Naively, constructing the boundary matrices B κ−1 , B κ then additionally requires superfluous sign information -the orientation of each member of both C κ−1 and C κ+1 .This situation exactly mirrors that of the graph Laplacian L 0 : in order to construct L 0 for an undirected graph via the product B 0 B T 0 , one must assign an arbitrary direction to each edge, and the resulting Laplacian is independent of that choice of directions.

Weighted and Normalized Hodge Laplacians.
In order to introduce a weighted simplicial complex, consider the symmetrically normalized graph Laplacian where D 0 = diag(|B 0 |D 1 1), the diagonal matrix of (weighted) vertex degrees, and D 1 is the diagonal matrix of edge weights.Letting D κ generally refer to a diagonal matrix containing κ-simplex weights, we proceed as in [6] and define the weighted symmetrically normalized κ-Laplacian as where Here D ℓ = diag(|B ℓ |D ℓ+1 1) for ℓ = κ − 1, κ, and D κ+1 is the diagonal matrix of (κ + 1)-hull weights.
From L sym κ we may define the usual weighted unnormalized, and weighted randomwalk normalized κ-Laplacians L wt κ and L rw κ , via the formulas: While in the combinatorial case, L κ vanishes for pairs σ ≃ τ, each of the weighted Laplacians may be nonzero whenever σ ∼ τ.
The signed weighted adjacency matrices S sym κ , S wt κ , S rw κ are defined analogously to S κ , as the negative of the off-diagonal parts of their respective Laplacians. Figure 3 demonstrates S κ , S sym κ , S wt κ , and S rw κ for a simple complex.
For κ = 0, with the vertices of G in natural orientation, we have that λ 0 = 0, λ 1 > 0, φ 0 = 1 and in particular is non-oscillatory, and that φ 1 acts as a single global oscillation, Fig. 3: The complex from Figure 1 on the left, with natural orientation displayed as directed edges and oriented triangles, together with four signed adjacency matrices, the combinatorial S 1 , the weighted symmetrically normalized S sym 1 , the weighted unnormalized S wt 1 , and the weighted random-walk normalized S rw 1 , all with D 2 = I .Notice that the weighted variants may have nonzero entries of various sign even for strongly adjacent simplices, unlike S 1 .
appropriate to partition the vertices of G with.Considering Lwt 0 for nontrivial p ±1, φ0 is oscillatory, and φ1 is no longer appropriate for clustering; this is one reason that oriented 0-simplices are always considered to be in natural orientation.
For κ > 0 however, it is no longer true that φ 0 will be non-oscillatory.Let p * be a vector of orientations such that where [φ 0 ] σ 0, [p * ] σ = sign([φ 0 ] σ ).Then the corresponding φ0 is non-oscillatory, and acts as a DC component.This motivates taking sign(φ 0 ) ⊙ φ 1 (element-wise) as the Fiedler vector of L wt κ , with which to partition C κ .We will aim to bipartition κ-regions by following a standard strategy in spectral clustering, of minimizing a relaxation of a combinatorial cut function over possible partitions.Just as a graph cut is typically defined as the volume of edge weight which crosses a partition of the vertices, we can define the consistency cut of C κ into subregions A, B as Because of the signs introduced by consistency, we consider S wt κ as the signed, weighted adjacency matrix for a signed graph over C κ , and so can utilize the framework of signed Laplacians [30].

, indicator functions for consistent/inconsistent pairs, respectively. Then, we can define the consistency volume Cvol
In the κ = 0 case, with all vertices in natural orientation, S wt 0 is just the usual adjacency matrix, and so S − 0 = 0; hence κCut = 2 Ccut, yielding the traditional cut objective.For κ > 0, κCut increases with the number of consistent pairs of κ-adjacent simplices across the partition, and with the number of inconsistent pairs within each κ-region.Equivalently, minimizing κCut requires maximizing consistent pairs within each κ-region, and maximizing inconsistent pairs across the partition.
Let L κ be the signed Laplacian with signed adjacency S wt κ .Let A be a κ-region, r A 1 A − 1 C κ \A , and define R A (L) r T A Lr A .Then because L κ differs from L wt κ only on the diagonal, R A (L κ ) differs from R A (L wt κ ) by a constant independent of A. From [30], we know that and we obtain φ 0 as a relaxed solution to κ-cut minimization.Now, notice that if the orientations of C κ were changed according to some p, this would be equivalent to a different choice of A; namely, if [p] σ = −1, then σ moves to the other side of the partition, either into or out of A. As all orientations are available to us, this includes one for which φ0 is non-oscillatory, so that its sign does not partition C κ .We then instead take φ1 as our relaxed solution, which we may compute via sign(φ 0 ) ⊙ φ 1 .
An improved cut objective is the signed Ratio Cut, which encourages more balanced partitions: From [30], we know that with r A above scaled by a factor of c A |A|/|C κ \ A|, the analogous result holds, that the eigenvectors of L κ yield a relaxed solution to min A⊂C κ SignedRatioCut(A).However, the new dependence on A means the resulting objective is slightly different for L κ , so the relaxation is only approximate.
Finally, the signed Normalized Cut balances the partitions by degree rather than simplex count: Here, the eigenvectors of diag(L κ ) −1 L κ yield a relaxed solution to min A⊂C κ SignedNormalizedCut(A), and an approximate relaxed solution is given by the eigenvectors of L rw κ .In our numerical experiments, we use the random-walk κ-Laplacian for bipartitioning simplicial complexes, and obtain its eigenvectors from those of L sym κ .

Hierarchical Bipartitions.
The foundation upon which our multiscale transforms on a κ-simplices C κ of a given simplicial complex C are constructed is a hierarchical bipartition tree (also known as a binary partition tree) of C κ , a set of tree-structured κsubregions of C κ constructed by recursively bipartitioning C κ .This bipartitioning operation ideally splits each κ-subregion into two smaller κ-subregions that are roughly equal in size while keeping tightly-connected κ-simplices grouped together.More specifically, let C j k denote the k t h κ-subregion on level j of the binary partition tree of C κ and n ).This partitioning is recursively performed until each subregion corresponding to the leaf contains only a simplex singleton.Note that k ′ = 2k if the resulting binary partition tree is a perfect binary tree.We note that even other (non-spectral) partitioning methods can be used to form the binary partition tree, but in this article, we stick with the spectral clustering using the Fiedler vectors.For more details see on hierarchical partitioning, (specifically for the κ = 0 case), see [26,Chap. 3] and [44].Figure 4 demonstrates such a hierarchical bipartition tree for a simple 2-complex consisting of triangles.

Orthonormal κ-Haar
Bases.The classical Haar basis [18] was introduced in 1909 as a piecewise-constant compactly-supported multiscale orthonormal basis (ONB) for squareintegrable functions but has since been recognized as a wavelet family and adapted to Fig. 4: One possible hierarchical bipartitioning of a simple 2-complex, from j = 0 with no partition on the left, to j = 5 on the right, where each of the 21 triangles forms their own subregion.Colors indicate distinct subregions.We have highlighted the subregions that contain more than one element for j = 4. many domains.In one dimension, the family of Haar wavelets on the interval [0, 1] can be generated by the following mother and scaling (or father) functions: Unfortunately, these definitions do not generalize to non-homogeneous domains due to the lack of appropriate translation operators and dilation operators [48].Instead, several methods have been proposed to generate similar bases, and overcomplete dictionaries to apply more abstract domains such as graphs and discretized manifolds [20,49,44].Here, we describe a method to compute similar, piecewise-constant locally supported bases for κ-simplex-valued functional spaces, which we call the (orthonormal) κ-Haar bases.
Rather than basing our construction on some kind of translation or transportation schemes, we instead employ the hierarchical bipartition, as we discussed in Section 4.2, to divide the domain, i.e., the κ-simplices C κ of a given simplicial complex C into appropriate locally-supported κ-regions.For each κ-region in the bipartition tree, if that region has two children in the tree, then we create a vector that is positive on one child, negative on the other, and zero elsewhere.To avoid sign ambiguity, we dictate that the positive portion is on the region whose region index is smaller among these two.See Algorithm 1 for the detail.
Several remarks on this basis are in order.First, since the division is not symmetrically dyadic, we need to compute the scaling factor for each region separately.For each given basis vector ξ except the scaling vector, we break it into positive and negative parts ξ + and ξ − and ensure that i ([ξ + ] i + [ξ − ] i ) = 0 and ∥ξ∥ = 1.If the members of κ-region are weighted, then this sum and norm can be computed with respect to those weights.Finally, we note that different hierarchical bipartition schemes may arise from the different weighting of the Hodge Laplacian, which will correspond to bases with different supports.Figure 5 demonstrates a 2-Haar basis based on the partition shown in Figure 4.

κ-Hierarchical Graph Laplacian Eigen Transform (κ-HGLET).
The first overcomplete transform we describe can be viewed as a generalization of the Hierarchical Block Discrete Cosine Transform (HBDCT).The classical HBDCT is generated by creat- ) ; i = i + 2; ing a hierarchical bipartition of the signal domain and computing the DCT of the local signal supported on each subdomain.We note that a specific version of the HBDCT (i.e., a homogeneous split of an input image into a set of blocks of size 8 × 8 pixels) has been used in the JPEG image compression standard [38].This process was generalized to the graph case in [21], i.e., the Hierarchical Graph Laplacian Eigen Transform (HGLET), from which we base our algorithm and notation.In turn, our κ-HGLET is a generalization of the HGLET for κ-simplices in a given simplicial complex.We organize this dictionary by grouping the elements into j max +1 orthonormal matrices Φ j j max j =0 where Φ j ∈ R n×n represents the orthonormal basis formed from the j th level of the bipartition.More specifically, let {φ j k,l } be the basis vectors in the κ-HGLET where j denotes the level of the partition (with j = 0 being the root), k indicates the partition within the level, and l indexes the elements within each partition in increasing frequency.
To compute the transform, we first compute the complete set of eigenvectors {φ 0 0,l } l =0:n−1 Fig. 6: 2-HGLET dictionary a 2-complex.Each row represents a different level j and the subregions within each level are shown with the boxes.Here, the color scale is consistent across each row to better visualize the smoothness of the elements.Note that we can generate an ONB by selecting any subset of boxes so that the union of those boxes contains exactly one element from each column.
of the Hodge Laplacian of the entire κ-simplices C κ of a given simplicial complex C and order them by nondecreasing eigenvalues.We then partition C κ into two disjoint κ-regions C 1 0 and C 1 1 as described in Section 4. We then compute the complete set of eigenvectors of the Hodge Laplacian on C 1 0 and C 1 1 .We again order each set by nondecreasing frequency (i.e., eigenvalue) and label these {φ 1 0,l } l =0:n 1 0 −1 and {φ 1  1,l } l =0:n 1 1 −1 Note that n 1 0 +n 1 1 = n 0 0 = n, and that all of the elements in {φ 1 0,l } are orthogonal to those in {φ 1  1,l } since their supports are disjoint.Then the set {φ 1 0,l } l =0:n 1 0 −1 ∪ {φ 1 1,l } l =0:n 1 1 −1 form an orthonormal basis for vectors on C κ (after extending these vectors beyond their support by zeros).From here, we apply this process recursively, generating an orthonormal basis for each level in the given hierarchical bipartition tree.This process is detailed in Algorithm 2.
If the hierarchical bipartition tree terminates at every region containing only a κsimplex singleton, then the final level will simply be the standard basis of R n .Each level of the dictionary contains an ONB whose vectors have the support of roughly half the size of the previous level.There are roughly (1.5) n possible ONBs formed by selecting different covering sets of regions from the hierarchical bipartition tree; see, e.g., [54,44] for more about the number of possible ONBs in such a hierarchical bipartition tree.Finally, we note that the computational cost of generating the entire dictionary is O (n 3 ) and that any valid hierarchical bipartition tree can be used to create a similar dictionary.Figure 6 shows the 2-HGLET constructed on the 2-complex and bipartition shown in Figure 4 .

κ-Generalized Haar-Walsh Transform (κ-GHWT).
The second transform we present here is based on the Generalized Haar-Walsh Transform (GHWT) [20], which can itself be viewed as a generalization of the Wash-Hadamard transform.This basis is formed by first generating a hierarchical bipartition tree of C κ .We then work in a bottom-up manner, beginning with the finest level in which each region only contains a single element.We call these functions scaling vectors and label them {ψ j max k,0 } k=0:n−1 .For the next level, we first assign a constant scaling vector for support on each region.Then, for each region that contains two children in the bipartition tree, we form a Haar-like basis element by subtracting the scaling function associated with the child element with a higher index from that child element with a lower index.This procedure will form an ONB {ψ Result: A multiscale overcomplete dictionary Φ j j max j =0 for j 0, . . ., j max do if j = 0 then Compute Φ 0 as solution to eigensystem L κ φ = λφ ; else for k 0, . . ., ; j max − 1 and l (k) = 1 or 2 depending on the subregion k) whose vectors have support of at most 2. For the next level, we begin by computing the scaling and Haar-like vectors as before.Next, for any region that contains three or more elements, we also compute Walshlike vectors by adding and subtracting the Haar-like vectors in the children's regions.From here, we form the rest of the dictionary recursively.A full description of this algorithm (for the κ = 0 case) is given in [21] and we present a generalized version in Algorithm 3.For ease of notation, we present the algorithm for an unnormalized dictionary.We would prefer a normalized dictionary in many applications, but the choice of norm can be problem dependent and can be done by simply looping thought the elements once the unnormalized dictionary has been created.Figure 7 displays the 2-GHWT dictionary on the same 2-complex used in Figures 5  and 6.We make several observations about this dictionary.First, like the κ-HGLET, each level of the dictionary forms an ONB, and at each level, basis vectors have the support of roughly half the size of the previous level.These basis vectors also have the same support as the κ-HGLET basis vectors (that is, supp(φ j k,l ) = supp(ψ j k,l ) for all j , k, l ).However, the computational cost of computing the κ-GHWT is only O (n log n) compared to the O (n 3 ) of the κ-HGLET.
Finally, we note that at the coarsest level ( j = 0) the κ-GHWT dictionary contains globally-supported piecewise-constant basis vectors, which are ordered by increasing oscillation (or "sequency").This forms an ONB analogous to the classical Walsh Basis.This allows us to define an associated Walsh transform and conduct Walsh analysis on signals defined on simplicial complexes.Although not the primary focus of this article, we conduct some numerical experiments using the Walsh bases explicitly in Section 7.

6.3.
Organizing the Dictionaries.For many downstream applications, it is important to organize the order of these bases.In general, the κ-HGLET dictionary is naturally ordered in a Coarse-to-Fine (C2F) fashion.In each region, the basis vectors are ordered by frequency (i.e., eigenvalue).Similarly, the GHWT dictionary is also naturally ordered Fig. 7: Coarse-to-Fine (C2F) 2-GHWT dictionary.The yellow, dark green, and violet regions in each vector indicate its positive, zero, and negative components, respectively.Each row represents a different level j and the subregions within each level are shown with the boxes.Similarly to the 2-HGLET dictionary in Figure 6, we can generate an ONB by selecting any subset of boxes so that the union of those boxes contains exactly one element from each column.Result: An (unnormalized) κ-GHWT dictionary {ψ j k,l } for j j max , . . ., 1 do if j = j max then for k = 0, . . ., n − 1 do Set ψ ,l ; else if only one subregion has a basis vector with tag l then Set ψ j −1 k,2l = ψ j k ′ ,l ; else if neither subregion has a basis vector with tag l then Do nothing Fig. 8: Fine-to-Coarse (F2C) 2-GHWT dictionary.Note that this dictionary is not generated by simply reversing the row indices of the C2F dictionary, but by also arranging each level (row) by "sequency."That is, each row is first sorted by tag l (shown by the black boxes) then each tag is sorted by region index k.The vectors enclosed by the red boxes form the Haar basis for this 2-complex while the vectors in the bottom row form the Walsh basis.
in a C2F fashion, with increasing "sequency" within each subgraph.Another useful way to order the GHWT is in a Fine-to-Coarse (F2C) ordering, which approximates "sequency" domain partitioning.See, e.g., Figure 8, which shows the F2C 2-GHWT dictionary on the same 2-complex used in Figure 7.We also note that the F2C ordering is not possible for the κ-HGLET dictionary because some parent subregions and the direct sum of their children subregions are not equivalent; see, e.g., [26,Eq. (5.6)] for the details.Other relabeling schemes, such as those proposed in [49,44] may also be useful but are outside the scope of this article and will be explored further in our future work.

Basis and Frame
Selection.Once we have established these arrangements of basis vectors, we can efficiently apply the best-basis algorithm [10] to select an ONB that is optimal for a task at hand for a given input signal or a class of input signals; see also our previous work of applying the best-basis algorithm in the graph setting [21,20,22,25,49,8,44].Given some cost function F and signal x, we traverse the bipartition tree and select the basis that minimizes F restricted to each region.For the C2F dictionary, we initialize the best basis as the finest ( j = j max ) level of the GHWT dictionary.We then proceed upward one level at a time and compute the cost of each subregion at that level and compare it to the cost of the union of its children subregions.If the latter cost is lower, the basis is updated; if not, the children subregions (and their basis vectors) are propagated to the current level.This algorithm yields the C2F best basis.The F2C best basis is performed similarly, i.e., we begin with the globally-supported basis ( j = 0) at the bottom of the rearranged tree and proceed in the same bottom-up direction.As for the HGLET dictionary, it has only a C2F basis as we discussed earlier.
In some contexts, it is not necessary to generate a complete ONB, but rather some sparse set of vectors in the dictionary (also known as atoms) that most accurately approximate a given signal or class of signals.In this case, we can directly apply the orthogonal matching pursuit of [4] to find the best m-dimensional orthogonal subframe (m ≤ n) selected from the dictionary.Additionally, for some downstream tasks, such as sparse approximation or sparse feature selection, generating orthogonal sets of atoms is not critical.In these cases, we can employ a greedy algorithm to generate efficient approximation.This algorithm simply selects the atoms in the dictionary with the largest coefficient, removes it, then computes the transform of the residual and proceeds so forth.This algorithm is quite expensive since it need to recompute the coefficients after each selection.Therefore, it is only suited for tasks when the number of elements are small, or we only need to compute a few features.These algorithms are studied extensively in the subsequent section.

Approximation Theory.
Signal approximation has been one of the most important applications for classical and graph wavelet bases.Theoretical justification of the efficacy often relies on developing approximation bounds and decay rates for the wavelet coefficients for various classes of signals under various norms (or seminorms).A number of these results have been specifically developed for graphs equipped with a hierarchical tree [16,9,50,23].In general, these results are based on first defining a distance function between vertices of a graph, then defining a Hölder seminorm based on this distance function, and then finally computing the decay rates.By defining a distance between any two κ-simplices as the number of elements in the smallest partition in the tree that constrains both elements, we can generalize these results.Formally, for singleton κ-elements σ and τ of C κ , and signal f , we define a distance function and then the associated Hölder seminorm as: where α is a constant in (0, 1].With these definitions, the dictionary coefficient decay and approximation results of [50,23] for the GHWT and HGLET can be applied to the κ-GHWT and κ-HGLET bases.For sake of space, we only state these theorems and do not reproduce the proofs since the adaptation is trivial after substituting the above definitions.THEOREM 6.1.For a simplicial complex C equipped with a hierarchical bipartition tree, suppose that a signal f is Hölder continuous with exponent α and constant C H ( f ).Then the coefficients with l ≥ 1 for the κ-HGLET (c j k,l ) and κ-GHWT (d j k,l ) satisfy: Proof.See Theorem 3.1 of [23].THEOREM 6.2.For a fixed orthonormal basis {φ l } n−1 l =0 and a parameter 0 < ρ < 2, where P m in the best nonlinear m-term approximation in the basis Proof.See Theorem 3.2 of [23] and Theorem 6.3 of [50].

Numerical Experiments.
We demonstrate the efficacy of our proposed partitioning techniques and basis constructions by conducting a series of experiments.In Section 7.1 we show how our multiscale bases and overcomplete dictionaries can be used to sparsely approximate signals defined on κ-simplices.In Section 7.2 we show how these representations can be used in supervised classification and unsupervised clustering problems.

Approximation and Signal Compression.
We begin with an illustrative example by creating some synthetic data for 1-and 2-simplices by triangulating a digital image.We start with a 512 × 512 "peppers" image and map it to a Cartesian grid on the unit square Fig. 9: Nonlinear approximation of the peppers image for κ = 2 [0, 1] 2 .We then randomly sample 1024 points within this square (not necessarily on a grid).We then create a triangular mesh from these points using Delaunay triangulation.Next, we interpolate the image from the Cartesian grid to the sampled vertices by computing the barycentric coordinate of each vertex from the square inside the Cartesian grid.Finally, we interpolate the signal to the edges and triangles of the triangulation by averaging the values of the vertices that they contain.The result, for our random seed, is a signal defined on the 3050 edges of the triangulation and another on the 2067 triangles.We now consider the sparse representation of these signals.Figure 9 shows the nonlinear approximation (i.e., using the largest expansion coefficients in magnitude) of the triangle-based signals in the Hodge Laplacian eigenbasis (Fourier), the orthonormal Haar basis, orthonormal Walsh basis as well as the approximation prescribed by applying the best-basis and greedy algorithms to the HGLET and GHWT dictionaries.Figure 10 shows the approximation error vs the number of terms used for both the edge-based and triangle-based functions.
A number of observations are in order.First, the multiscale dictionary-based methods consistently outperformed the generic orthonormal bases.The greedy approximation algorithm achieved the best approximation results, but it is also more costly to compute than any of the other methods, and the set of atoms used in the approximation may not be orthogonal.This may be detrimental to downstream tasks.Overall the GHWT-based method performed best, with the F2C best basis performing much better than the C2F best basis, which suggests that the fine-scale features of this signal are the most important.Similarly, the Walsh basis achieved much better results than the Haar basis, again emphasizing the necessity of capturing details at the fine scale.
Next, we apply our approach to real-world data containing higher-order simplices with κ = 0, 1, . . ., 5. The citation complex [37,14] is a simplicial complex derived from the co-author/citation complex (CC) [53], which models the interactions between multiple authors of scientific papers.A paper with κ authors is represented by a (κ − 1)-simplex.We first build a graph whose vertices represent the authors in this CC database.Then, the vertices are connected by edges that represent co-authored papers.Note that if two authors co-authored multiple papers, these two vertices are connected by a single edge.Next, we assign each edge the sum of the citation numbers of all the co-authored papers by the authors, forming this edge as its weight (or value).Finally, we assign each higherorder simplex the sum of the values of its lower-order simplices as its value.See [14] for a more thorough description of the construction of this complex.Table 1 reports some basic information about the number of simplices of different dimensions in this citation complex.Figure 11 shows the nonlinear approximation of this signal (i.e., a vector of citation numbers) for κ = 0, 1, . . ., 5 with the Delta, Fourier, Haar, HGLET, and GHWT bases.Figure 12 shows the log error.The HGLET and GHWT bases were selected by the best-basis algorithm.
In these experiments, we observe that the best bases (GHWT and HGLET) outper- formed the canonical bases, with the GHWT being the most efficient basis for each κ.Additionally, for κ > 0, the orthonormal Haar basis performed best in the semi-sparse regime (1 and 10% of terms retrained).This suggests that the signals on each dimension of the citation complex are similar in that they are all close to being piecewise constant.However, when more terms are considered, the HGLET best basis achieved a lower approximation error than the orthonormal Haar basis achieved.

Signal Clustering and Classification.
Since the basis (and dictionary) vectors we present are both multiscale and built from the Hodge Laplacians that are aware of both topological and geometric properties of the domain [6], they can function as very powerful feature extractors for general data science applications.In this section, we present two downstream applications: 1) a supervised classification problem; and 2) an unsupervised clustering problem.For baselines, we compare our proposed dictionaries with Fourier and Delta (indicator function) bases and with the Hodgelets proposed in [39] for cases when κ = 1.

Supervised Classification.
First, we present our study in supervised classification.We begin by computing edge-valued signals for 1000 handwritten digits from the MNIST dataset [31] by sampling 500 points in the unit square and following the interpolation method presented for the peppers image in Section 7.1.We then compute the features of these images using the proposed orthogonal transforms and best bases from the overcomplete dictionaries.Next, we train a support vector machine (SVM) to classify the digits for each of the transformed representations using the 1000 training examples.Finally, we test these SVMs on the rest of the whole MNIST dataset.We repeat this experiment for the FMNIST dataset [56], again using only 1000 examples for training data.Results are presented in Table 2.We remark that these tests are not meant to achieve state-of-the-art results for image classification but rather to showcase the effectiveness of these representations for downstream tasks.Unsurprisingly, the signal-adaptive dictionary methods outperformed the non-adaptive basis methods.Again, the piecewise-constant methods (GHWT, Haar) achieved better approximations than the smoother methods (Fourier, HGLET, Joint, and Separate Hodgelets).This is likely due to the near-binary nature of images in both datasets.

Unsupervised Clustering.
A natural setting for studying signals on 1-simplices C 1 is the analysis of trajectories [6,40,39].Of particular interest is the case where the domain has nontrivial topological features.Such is the case of the Global Drifter Program dataset, which tracks the positions of 334 buoys dropped into the ocean at various points around the island of Madagascar [39].
We begin by dividing the dataset into three subsets, train (|X tr | = 176), test (|X te | = 83) and validation (|X vl | = 84).We then use orthogonal matching pursuit [4] (OMP) to compute the m significant features of the training set.Next, we extract these features for the test set and use them to compute the centroids {c j } d j =1 for each cluster.To evaluate these clusters K -score (i.e., the standard k-means objective) on the transformed features . ., 7 (number of clusters).Figure 13 summarizes the results of this test, while Table 3 shows the full numerical results.In this experiment, the GHWT outperformed all other bases because the trajectories are roughly constant and locally supported.The orthogonal matching pursuit scheme can select elements with the correct support size, and the piecewise-constant nature of the GHWT atoms can capture the action of the trajectory with very few elements.

Conclusions and Future Work.
In this article, we have developed several generalizations of orthonormal bases and overcomplete transforms/dictionaries for signals defined on κ-simplices, and demonstrated their usefulness for data representation on both illustrative synthetic examples and real-world simplicial complexes generated from a coauthorship/citation dataset and an ocean current/flow dataset.However, there are many more tools from harmonic analysis that we have not addressed in this article.From a theoretical standpoint, future work may involve: 1) defining additional families of multiscale transforms such as the extended Generalized Haar-Walsh Transform (eGHWT) [49] and Natural Graph Wavelet Packets (NGWPs) [8]; 2) exploring different best-basis selection criteria tailored for classification and regression problems such as the Local Discriminant Basis [41,43] and the Local Regression Basis [42] on simplicial complexes; and 3) investigating nonlinear feature extraction techniques such as the Geometric Scattering Trans-form [15].From an application standpoint, we look forward to applying the techniques presented here to data science problems in computational chemistry, weather forecasting, and genetic analysis, all of which have elements that are naturally modeled with simplicial complexes.

Fig. 2 :
Fig. 2: Pairs of κ-simplices demonstrating consistency at their boundary face, for κ = 1, 2. The mixed-color pairs are consistent, and the same-color pairs are inconsistent.
e., level j = 0 represents the root node of this tree.Then the two children of C j k in the tree, C j +1 k ′ and C j +1 k ′ +1 , are obtained through partitioning C j k using the Fiedler vector of L rw κ (C j k

Fig. 5 :
Fig. 5: The 2-Haar basis vectors for the bipartition shown in Figure 4.The yellow, dark green, violet regions in each vector indicate its positive, zero, and negative components.

Algorithm 1 :
Generating κ-Haar Basis Data: C j k j ,k : A hierarchical bipartition tree of the κ-simplices C κ as defined in Section 4.2, K j denotes the number of subregions on the level j , n j k C j k Result: An unnormalized κ-Haar Basis {ξ

Algorithm 2 :
Generating κ-HGLET DictionaryData: C j k j ,k : A hierarchical bipartition tree of the κ-simplices C κ as defined in Section 4.2, K j denotes the number of subregions on the level j , κ : κ-Hodge Laplacian of C κ

Algorithm 3 :
Generating κ-GHWT DictionaryData: C j k j ,k : A hierarchical bipartition tree of the κ-simplices C κ as defined in Section 4.2, K j denotes the number of subregions on the level j ,

Fig. 12 :
Fig. 12: Top: Nonlinear approximation of the Citation Complex for κ = 0, . . ., 5. Bottom: Log of the error for up to 50% of the terms retained.

Fig. 13 :
Fig.13: Extensive results for buoy cluster test.Leftmost figure shows which method preformed best, the second to the left shows the second best and so on.The x-axis in each subplot indicates the number of coefficients used and the y-axis is the number of clusters.Full numerical results are presented in Table3.

Table 2 :
Test Accuracy for SVMs trained on transforms of MNIST signals interpolated to a random triangulation