3.1 Introduction

Hypergraph computation can be roughly divided into three types: representation learning of a hypergraph, where each subject is represented by a hypergraph of its components; representation learning of vertices in a hypergraph, where each subject is a vertex in the hypergraph; and hypergraph structure prediction, which aims to find the connections among vertices. These three directions correspond to three computation paradigms, named intra-hypergraph computation, inter-hypergraph computation, and hypergraph structure computation, respectively. In this chapter, we introduce the generalized computation paradigms for these three directions and show how to formulate practical tasks in these hypergraph computation frameworks. Note that specific implementations of the generalized functions in each paradigm are not introduced here, as they are parts of specifically defined functions or modules in the hypergraph computation framework and will be introduced in subsequent chapters.

3.2 Intra-hypergraph Computation

Intra-hypergraph computation targets learning the representation of a single subject using its internal component information, where the correlations among the components of the subject are formulated in a hypergraph. In this hypergraph, the components of the subject are regarded as the set of vertices, and their high-order correlations are modeled by hyperedges. In this way, the individual subject is transformed into a hypergraph. As this hypergraph is generated from the subject’s own components, we call it the intra-hypergraph of the subject.

Image representation and understanding [1,2,3] are typical intra-hypergraph computation applications. For example, an image can be split into a group of patches, and each patch is denoted by a vertex in the hypergraph. The hypergraph can be generated according to the semantic and spatial information of these patches. The information of these patches and their high-order correlations can then be used simultaneously to learn the representation of the image.
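To make this construction concrete, the following is a minimal sketch that builds an intra-hypergraph from patch features with a k-nearest-neighbor rule, where each patch spawns one hyperedge grouping itself with its k most similar patches. The feature matrix, the neighborhood size k, and the one-hyperedge-per-patch design are illustrative assumptions; spatial neighborhoods could be used in the same way.

```python
import numpy as np

def knn_incidence(X: np.ndarray, k: int) -> np.ndarray:
    """Hyperedge j connects patch j and its k nearest neighbors in feature space."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between patch features.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    H = np.zeros((n, n))
    for j in range(n):
        # argsort places patch j itself first (distance 0), then its k neighbors.
        H[np.argsort(d2[j])[: k + 1], j] = 1.0
    return H

X = np.random.rand(6, 16)   # 6 patches with 16-dim features (toy stand-in)
H = knn_incidence(X, k=2)   # incidence matrix: 6 vertices, 6 hyperedges
```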

The general paradigm of intra-hypergraph computation can be described as follows. Given a target subject containing n components, represented by feature vectors \(\mathbf {X}\in \mathbb {R}^{n \times d}\), an intra-hypergraph \(\mathbb {G}\) can be generated to formulate the high-order correlations inside the subject, whose incidence matrix is denoted by H. The representation of the individual subject can be learned by

$$\displaystyle \begin{aligned} {\mathbf{Z}}_{\mathbb{G}} = f_{\varTheta}(\mathbf{H},\mathbf{X}), \end{aligned} $$
(3.1)

where \(\varTheta \) denotes the to-be-learned parameters. The function \(f_{\varTheta }(\cdot )\) can be neural network layers or other computing operators that aggregate the information of vertices based on the hypergraph structure. Intra-hypergraph computation integrates the complex correlations among components into the learned representation, and can thus extract more information than simple aggregation operations.
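As a concrete illustration, below is a minimal sketch of one possible instance of \(f_{\varTheta }(\cdot )\) in Eq. (3.1), assuming a single degree-normalized vertex-hyperedge-vertex aggregation step, a learnable linear transform, and a mean readout over vertices; the specific layers used in practice are introduced in later chapters.

```python
import numpy as np

def intra_hypergraph_embed(H, X, Theta):
    """One possible f_Theta(H, X): aggregate over the hypergraph, then read out."""
    De_inv = np.diag(1.0 / np.maximum(H.sum(0), 1e-12))  # inverse hyperedge degrees
    Dv_inv = np.diag(1.0 / np.maximum(H.sum(1), 1e-12))  # inverse vertex degrees
    E = De_inv @ H.T @ X          # gather vertex features into each hyperedge
    V = Dv_inv @ H @ E            # scatter hyperedge features back to vertices
    Z_v = np.tanh(V @ Theta)      # transform with learnable parameters Theta
    return Z_v.mean(axis=0)       # readout: a single embedding for the subject

n, d, d_out = 6, 16, 8
X = np.random.rand(n, d)
H = (np.random.rand(n, 3) > 0.5).astype(float)  # toy incidence matrix
Theta = np.random.randn(d, d_out)
z_G = intra_hypergraph_embed(H, X, Theta)       # Z_G in Eq. (3.1), shape (d_out,)
```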

In this paradigm, the subject to be analyzed is regarded as a whole system, and the intra-hypergraph models the correlations inside the system. This process is shown in Fig. 3.1.

Fig. 3.1 An illustration of intra-hypergraph computation and inter-hypergraph computation

3.3 Inter-hypergraph Computation

Inter-hypergraph computation targets learning the representation of a subject by considering its correlations with other subjects. In this hypergraph, each subject, including the target one, is regarded as a vertex, and the high-order correlations among subjects are modeled by hyperedges. In this way, the group of subjects is transformed into a hypergraph. As this hypergraph is generated from cross-subject correlations, we call it the inter-hypergraph of the subject. Subject classification and retrieval [4,5,6,7] are typical inter-hypergraph computation applications. For example, we can take an image as the target subject, together with a pool of other images for processing. Each image is denoted by a vertex in the hypergraph. The hypergraph can be generated according to the semantic and spatial information of these images. The information of these images and their high-order correlations can then be used simultaneously to learn the representation of the target image.

The general paradigm of inter-hypergraph computation can be described as follows. Given a target subject and n − 1 other subjects, represented by feature vectors \(\mathbf {X}\in \mathbb {R}^{n \times d}\), an inter-hypergraph \(\mathbb {G}\) can be generated to formulate the high-order correlations among these subjects, whose incidence matrix is denoted by H. The representation of the target subject can be learned by

$$\displaystyle \begin{aligned} {\mathbf{Z}}_{\mathbb{V}} = f_{\varTheta}(\mathbf{H},\mathbf{X}). \end{aligned} $$
(3.2)

The vertex embeddings can be further used for downstream tasks, such as vertex classification, where the vertices are associated with pre-defined labels \(\mathbf {Y} \in [K]^n\). This process is also shown in Fig. 3.1.
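As a small illustration, the sketch below passes vertex embeddings, such as those produced by Eq. (3.2), through a linear softmax head for vertex classification. The random stand-in embeddings and the linear head are illustrative assumptions; \(f_{\varTheta }(\cdot )\) itself can be instantiated as in the intra-hypergraph sketch above, without the final readout.

```python
import numpy as np

def classify_vertices(Z_v, W):
    """Map vertex embeddings (n x d) to one of K labels via a softmax head."""
    logits = Z_v @ W                                       # (n, K)
    probs = np.exp(logits - logits.max(1, keepdims=True))  # numerically stable softmax
    probs /= probs.sum(1, keepdims=True)
    return probs.argmax(1)                                 # predicted label per vertex

n, d, K = 8, 16, 3
Z_v = np.random.rand(n, d)          # stand-in for Z_V = f_Theta(H, X) in Eq. (3.2)
W = np.random.randn(d, K)
y_pred = classify_vertices(Z_v, W)  # one label in {0, ..., K-1} per subject
```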

It is noted that a hypergraph structure can be either homogeneous or heterogeneous, depending on the definition of vertices. Given multiple types of data, or multi-modal data, another way to formulate such correlations is to generate multiple hypergraphs accordingly. For example, supposing that there are m types of features or modalities, denoted by \({\mathbf {X}}_1, {\mathbf {X}}_2, \dots , {\mathbf {X}}_m\), we can construct one hypergraph for each modality, respectively. In this way, we have m hypergraphs \(\mathbb {G}_1 = (\mathbb {V}_1; \mathbb {E}_1; {\mathbf {W}}_1), \mathbb {G}_2 = (\mathbb {V}_2; \mathbb {E}_2; {\mathbf {W}}_2), \dots , \mathbb {G}_m = (\mathbb {V}_m; \mathbb {E}_m; {\mathbf {W}}_m)\) for the data with m modalities. The general paradigm for multi-modal inter-hypergraph computation can be described as

$$\displaystyle \begin{aligned} {\mathbf{Z}}_{\mathbb{V}} = f_{\varTheta}({\mathbf{H}}_1, {\mathbf{H}}_2, \dots, {\mathbf{H}}_m, {\mathbf{X}}_1, {\mathbf{X}}_2, \dots, {\mathbf{X}}_m), \end{aligned} $$
(3.3)

where \({\mathbf {H}}_1, {\mathbf {H}}_2, \dots , {\mathbf {H}}_m\) are the incidence matrices of the m hypergraphs.
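The sketch below shows one simple way to instantiate Eq. (3.3), assuming a late-fusion design in which each modality is aggregated on its own hypergraph and the per-modality vertex embeddings are concatenated; other fusion schemes, such as shared hyperedge weighting or attention across modalities, are equally valid.

```python
import numpy as np

def hypergraph_aggregate(H, X):
    """Degree-normalized vertex -> hyperedge -> vertex aggregation."""
    De_inv = np.diag(1.0 / np.maximum(H.sum(0), 1e-12))
    Dv_inv = np.diag(1.0 / np.maximum(H.sum(1), 1e-12))
    return Dv_inv @ H @ (De_inv @ H.T @ X)

def multimodal_embed(Hs, Xs, Thetas):
    """Aggregate each modality on its own hypergraph, then concatenate."""
    Z = [np.tanh(hypergraph_aggregate(H, X) @ T)
         for H, X, T in zip(Hs, Xs, Thetas)]
    return np.concatenate(Z, axis=1)

n = 8
Xs = [np.random.rand(n, 16), np.random.rand(n, 32)]            # two modalities
Hs = [(np.random.rand(n, 4) > 0.5).astype(float) for _ in Xs]  # one hypergraph each
Thetas = [np.random.randn(X.shape[1], 8) for X in Xs]
Z_v = multimodal_embed(Hs, Xs, Thetas)                         # Z_V in Eq. (3.3), shape (8, 16)
```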

3.4 Hypergraph Structure Computation

Hypergraph structure computation aims to learn the high-order correlations among data in the presence of missing links and an inaccurate initial structure. There are two scenarios in which hypergraph structure computation is performed: either the affiliation relationships between vertices and hyperedges are incomplete, or the set of hyperedges itself is incomplete. Recommender systems and drug discovery [8,9,10] are typical hypergraph structure computation applications. For example, in a recommender system, the hyperedges describe the connections between items and users with specific semantics. The number of hyperedges is fixed, and the features of both vertices and hyperedges can be obtained as the input. Here, the target of hypergraph structure computation is to predict whether a vertex belongs to a hyperedge or not. If a new vertex-hyperedge affiliation is predicted, we obtain a new link indicating the connection. In a knowledge hypergraph, by contrast, the hyperedges represent facts in the real world, which are usually highly incomplete. The missing links are expected to be inferred from the existing links by hypergraph structure computation. Therefore, in the second case, the objective of hypergraph structure computation is not only optimizing the existing links but also inferring the unobserved links.

In the following, we describe the computation paradigms of these two cases separately. The first scenario is that the set of hyperedges is complete while the affiliation relationships between vertices and hyperedges are incomplete. In this case, we can usually extract a feature vector for each hyperedge. Given the input of vertex features \({\mathbf {X}}_{\mathbb {V}}\) and hyperedge features \({\mathbf {X}}_{\mathbb {E}}\), we can calculate the incidence matrix by a function of the vertex and hyperedge features as

$$\displaystyle \begin{aligned} {\mathbf{H}}^{*} = f_{\varTheta}({\mathbf{X}}_{\mathbb{V}}, {\mathbf{X}}_{\mathbb{E}}). \end{aligned} $$
(3.4)

For example, an attention score between vertex and hyperedge features can be used as an instance of this function in practice.
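A minimal sketch along these lines is given below, assuming a scaled dot-product attention score between projected vertex and hyperedge features, followed by a fixed threshold; the projections W_q and W_k and the threshold are illustrative assumptions. In practice, the hard threshold would typically be replaced by a differentiable relaxation so that the parameters can be trained end to end.

```python
import numpy as np

def predict_incidence(X_v, X_e, W_q, W_k, tau=0.5):
    """Score every (vertex, hyperedge) pair and threshold into an incidence matrix."""
    Q = X_v @ W_q                              # project vertex features
    K = X_e @ W_k                              # project hyperedge features
    scores = Q @ K.T / np.sqrt(Q.shape[1])     # scaled dot-product attention scores
    P = 1.0 / (1.0 + np.exp(-scores))          # membership probability per pair
    return (P > tau).astype(float)             # predicted incidence matrix H*

n_v, n_e, d, d_k = 8, 4, 16, 8
X_v, X_e = np.random.rand(n_v, d), np.random.rand(n_e, d)
W_q, W_k = np.random.randn(d, d_k), np.random.randn(d, d_k)
H_star = predict_incidence(X_v, X_e, W_q, W_k)   # Eq. (3.4), shape (n_v, n_e)
```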

In the second scenario, where there are missing hyperedges in the observed hypergraph and the semantics of hyperedges are ambiguous, it is difficult to directly describe the hyperedges by features. Consequently, only the initial incomplete hypergraph structure and the features of vertices are available as the input. We denote the incidence matrix of the initial hypergraph structure by \({\mathbf {H}}^{(0)}\). The computation paradigm can be written as

$$\displaystyle \begin{aligned} {\mathbf{H}}^{*} = f_{\varTheta}({\mathbf{X}}_{\mathbb{V}}, {\mathbf{H}}^{(0)}), \end{aligned} $$
(3.5)

which indicates that the new hypergraph structure is updated based on the original hypergraph structure following specific prior information.
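One simple way to instantiate Eq. (3.5) is sketched below, assuming feature similarity as the prior information: each observed hyperedge is summarized by the mean feature of its member vertices, and a vertex is additionally linked to a hyperedge when their cosine similarity exceeds a threshold. The hyperedge summary, the similarity measure, and the threshold are illustrative assumptions.

```python
import numpy as np

def normalize_rows(M):
    return M / np.maximum(np.linalg.norm(M, axis=1, keepdims=True), 1e-12)

def refine_structure(X_v, H0, tau=0.9):
    """Keep all observed links in H0 and add high-similarity missing ones."""
    # Represent each observed hyperedge by the mean feature of its vertices.
    X_e = (H0.T @ X_v) / np.maximum(H0.sum(0), 1e-12)[:, None]
    # Cosine similarity between every vertex and every hyperedge.
    sim = normalize_rows(X_v) @ normalize_rows(X_e).T
    return np.maximum(H0, (sim > tau).astype(float))   # updated structure H*

n = 8
X_v = np.random.rand(n, 16)
H0 = (np.random.rand(n, 3) > 0.6).astype(float)   # incomplete observed structure
H_star = refine_structure(X_v, H0)                # Eq. (3.5), same shape as H0
```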

To guide the evolution of the hypergraph structure toward more accurately modeling data correlations, it is necessary to evaluate the quality of the hypergraph structure based on the training data and prior information. If partial ground-truth information about the hypergraph structure is available, the performance of correlation modeling can be evaluated directly. However, in most cases there is no gold standard for the hypergraph structure. Therefore, we may need to perform downstream tasks using the new hypergraph and indirectly evaluate the hypergraph computation performance through the downstream task results. As illustrated in Fig. 3.1, hypergraph structure computation can be conducted under the intra- and inter-hypergraph computation frameworks.

3.5 Summary

In this chapter, we introduce three hypergraph computation paradigms for different scenarios: intra-hypergraph computation, inter-hypergraph computation, and hypergraph structure computation. They focus on learning the representation of a single subject using its internal component information, learning the representation of a subject by considering its correlations with other subjects, and learning the high-order correlations among data in the presence of missing links and an inaccurate initial structure, respectively. This chapter provides an overview of how to use hypergraph computation; the detailed hypergraph computation theory, methods, and applications will be introduced in the following chapters.