## Abstract

Electronic systems living on *Archimedean* lattices such as kagome and square–octagon networks are presently being intensively discussed for the possible realization of topological insulating phases. Identifying the most interesting electronic topological states in an unbiased way is, however, not straightforward due to the large parameter space of possible Hamiltonians. A possible approach to tackle this problem is provided by a recently developed statistical learning method (Mertz and Valentí in Phys Rev Res 3:013132, 2021. https://doi.org/10.1103/PhysRevResearch.3.013132), based on the analysis of large data sets of randomized tight-binding Hamiltonians labeled with a topological index. In this work, we complement this technique by introducing a *feature engineering* approach which helps to identify polynomial combinations of Hamiltonian parameters that are associated with non-trivial topological states. As a showcase, we employ this method to investigate the possible topological phases that can manifest on the square–octagon lattice, focusing on the case in which the Fermi level of the system lies at a high-order van Hove singularity, in analogy to recent studies of topological phases on the kagome lattice at the van Hove filling.

## 1 Introduction

The majority of structural and electronic properties of crystals are ultimately determined by the spatial arrangements of their atoms, which form a lattice structure that repeats periodically in space. In two dimensions, the enumeration of the lattice structures is connected to the problem of finding the possible *tessellations* of a planar surface, i.e., the different ways to cover an infinite plane by a repeated juxtaposition of certain geometrical shapes (*tiles*). When taking a single regular polygon as tile, only three possible networks which are homogeneous with respect to vertices, tiles and edges can be formed: the triangular, square and honeycomb lattices [1]. These periodic structures are usually referred to as *Platonic* lattices and are ubiquitous in condensed matter systems. If the condition of homogeneity is loosened and one is allowed to employ different regular polygons as tiles, the so-called *Archimedean* lattices can be constructed, which are homogeneous only with respect to vertices [2].

The square–octagon lattice forms one of the eleven possible Archimedean tessellations of the two-dimensional plane [1]. It consists of a repetition of regular square and octagonal tiles, whose vertices define a crystal structure with a four-site unit cell repeated over an underlying square Bravais lattice (see Fig. 1a). The simple tight-binding treatment of the square–octagon nearest-neighbor network reveals rather intriguing properties of the electronic band structure: as shown in Fig. 1b, at 1/4 and 3/4 fillings, the energy dispersion shows a partially flat band intersecting two linearly dispersing bands, which form a Dirac cone. The flat dispersion results in the presence of a high-order van Hove singularity [3], namely a power-law divergence of the density of states, which is expected to enhance the effects of electronic correlations and favor the emergence of Fermi surface instabilities [4, 5]. From a theoretical perspective, the question of the role of van Hove singularities for the onset of topological phases has been intensively investigated in the recent past in the context of the charge-density wave phase of AV\(_3\)Sb\(_5\) kagome metals (with A=K, Rb, Cs) [6,7,8]. Several works suggested the existence of a topological flux phase among the possible instabilities of the kagome band structure at the van Hove filling [9,10,11,12,13]. It is worth noting that although a simple tight-binding approach on the kagome lattice yields a band structure with conventional van Hove singularities, recent photoemission experiments detected the presence of a high-order van Hove singularity (close to the Fermi energy) in the band structure of CsV\(_3\)Sb\(_5\) [14]. In this regard, the peculiar electronic dispersion of the square–octagon lattice is an intriguing minimal playground to investigate the possible onset of topological phases when the Fermi level of the system cuts through a high-order van Hove singularity.

It is worth mentioning that there are various proposals of two-dimensional compounds with a square–octagon geometry, such as monolayers of nitrogen group elements [15], metal nitrides and carbides [16], a possible allotrope of monolayer MoS\(_2\) [17], or two-dimensional polymers [18, 19]. Most importantly, several synthesis routes to fabricate T-graphene (octagraphene), a tetrasymmetrical carbon allotrope with a square–octagon periodic structure, have been put forward recently [20,21,22,23,24]. The square–octagon lattice is also found as a two-dimensional section of three-dimensional crystals, e.g., in the *xz*-plane of the *Hollandite* structure, which is characteristic of certain Mn-oxides [25,26,27,28,29,30]. Additionally, the one-fifth depleted square lattice which describes the periodic arrangement of vanadium atoms in the antiferromagnetic CaV\(_{4}\)O\(_{9}\) compound [31,32,33] is topologically equivalent to the square–octagon lattice (at first-neighbors) and often referred to as the CaVO lattice. In the past decades, several studies investigated Heisenberg-like models on the CaVO/square–octagon lattice in the context of frustrated magnetism [34,35,36,37,38,39,40,41,42,43]. On the other hand, more recently, a number of theoretical works have focused on different electronic models on the square–octagon lattice, with a focus on topological properties [19, 44,45,46,47,48] and superconductivity [49, 50], mostly motivated by the synthesis of T-graphene [24].

In this work, we explore the possible topological phases that can manifest on the square–octagon lattice at 1/4 filling, i.e., when the Fermi level of the system lies precisely at the high-order van Hove singularity. Our study is based on a recently developed statistical learning method [13, 51, 52], in which a large data set of randomized tight-binding Hamiltonians is generated and subsequently analyzed by statistical tools drawn from machine learning approaches, with the aim of gaining insight into possible topological phases. We complement the methodology outlined in Ref. [51] by introducing *feature engineering* as a tool to identify physical observables that are associated with non-trivial topological phases [52].

The paper is organized as follows: Sect. 2 is devoted to the description of the statistical method and discusses the concepts of marginal probability distributions, importance score, and feature engineering, which are employed for the data analysis; in Sect. 3 we apply the statistical method to the specific case of a square–octagon electronic system, defining the general form of the Hamiltonian; in Sect. 4 we discuss the results of the statistical study, iterating several steps of dimensional reduction of the feature space in order to reach a minimal description of the topological phases; finally, in Sect. 5 we summarize our findings.

## 2 Method

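Within the framework of the statistical method introduced in Ref. [51], one can explore topological phases on an arbitrary lattice by considering fermionic tight-binding Hamiltonians of the following type

\[
H = \sum _{i} \epsilon _i \, c^{\dagger }_i c_i + \sum _{i \ne j} t_{i,j} \, c^{\dagger }_i c_j , \tag{1}
\]

where \(c^{\dagger }_i\) (\(c_i\)) creates (annihilates) a fermion on site *i* and hermiticity requires \(t_{j,i} = t_{i,j}^{*}\).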

The two parts of the Hamiltonian consist of onsite potentials (\(\epsilon _i \in {\mathbb {R}}\)) and complex-valued hopping terms (\(t_{i,j} \in {\mathbb {C}}\)), respectively. The \(t_{i,j}\) hopping integrals are taken to be translationally invariant and assumed to vanish when the distance between sites *i* and *j* exceeds a certain (arbitrary) threshold. For this reason, we conveniently introduce the notation \(t^n_s\) for the various independent hopping terms, where *n* labels the possible Euclidean distances \(R_n\), sorted in ascending order, and *s* is an index denoting the inequivalent bonds at distance \(R_n\). The electronic filling of the system is fixed by filling a certain number of energy bands, starting from the lowest one.

The independent onsite potentials and hopping parameters of the Hamiltonian are dubbed *features* and collected into the vector \(\vec {x}=(x_1,\dots ,x_{N_f})\), with \(N_f\) indicating the number of features. Within the statistical method, a certain choice of the entries of \(\vec {x}\) is referred to as a *sample* and fully determines the electronic properties of the Hamiltonian. At a given filling, each sample can be characterized by the bandgap \(E_g\), which classifies samples into metals and insulators. If a sample is insulating, we can assign it a topological index, named *label*, which distinguishes trivial and topological samples. In the following, we choose the first Chern number *C* as the label [53,54,55].
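The text does not spell out how *C* is evaluated numerically. One common choice, assumed here purely for illustration, is the link-variable discretization of the Berry curvature by Fukui, Hatsugai and Suzuki. The sketch below takes a hypothetical callable `h_of_k(kx, ky)` that returns a Hermitian Bloch matrix periodic under \(k \rightarrow k + 2\pi\), and sums the lattice field strength of the lowest `n_occ` bands. The function name, interface and grid size are our assumptions, not part of the original method description.

```python
import numpy as np

def chern_number(h_of_k, n_occ, n_k=24):
    """Lattice Chern number of the lowest n_occ bands (link-variable method)."""
    ks = 2 * np.pi * np.arange(n_k) / n_k
    dim = h_of_k(0.0, 0.0).shape[0]
    vecs = np.empty((n_k, n_k, dim, n_occ), dtype=complex)
    for a, kx in enumerate(ks):
        for b, ky in enumerate(ks):
            _, v = np.linalg.eigh(h_of_k(kx, ky))   # eigenvalues in ascending order
            vecs[a, b] = v[:, :n_occ]               # occupied eigenvectors

    def link(v1, v2):
        # U(1) link variable between two neighboring k-points
        d = np.linalg.det(v1.conj().T @ v2)
        return d / abs(d)

    total = 0.0
    for a in range(n_k):
        for b in range(n_k):
            a1, b1 = (a + 1) % n_k, (b + 1) % n_k
            loop = (link(vecs[a, b], vecs[a1, b])
                    * link(vecs[a1, b], vecs[a1, b1])
                    * link(vecs[a1, b1], vecs[a, b1])
                    * link(vecs[a, b1], vecs[a, b]))
            total += np.angle(loop)                 # field strength of one plaquette
    return total / (2 * np.pi)
```

For a four-band model at 1/4 filling one would call this with `n_occ=1`. The result is meaningful only if the occupied bands are separated from the others by a direct gap at every grid point, and its overall sign depends on the orientation convention chosen for the plaquette loop.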

For an unbiased statistical analysis of a particular lattice, we generate a large number of different samples, i.e., different tight-binding Hamiltonians. This can be done by randomly picking tight-binding parameters \(x_i\) according to a certain probability distribution function (PDF), e.g., a uniform or a Gaussian distribution. The generated data set is then analyzed by calculating the *marginal probability distribution functions*

\[
p_C(x_i) = \int \rho _C(\vec {x}) \prod _{j \ne i} \textrm{d}x_j \tag{2}
\]

for each feature \(x_i\) and label *C*. Here, \(\rho _C(\vec {x})\) is the bare PDF of all the *insulating* samples with Chern number *C*. By inspecting the properties of the marginal PDFs \(p_C(x_i)\) of the various features \(x_i\), we can thus determine the feature values which are most descriptive for the phase with Chern number *C*. This allows us to identify, for example, which patterns of hopping parameters are associated with a certain topological phase.
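In practice, a marginal PDF can be estimated by a normalized histogram of one feature over all insulating samples that carry a given label. The following is a minimal sketch of that step; the function name `marginal_pdf` and the synthetic data are our own stand-ins and do not come from the data set of the paper.

```python
import numpy as np

def marginal_pdf(values, labels, chern, bins=50, value_range=None):
    """Histogram estimate of p_C for one real-valued projection of a feature.

    values : (n_samples,) real array, one number per insulating sample
    labels : (n_samples,) integer Chern numbers of the same samples
    """
    selected = values[labels == chern]
    pdf, edges = np.histogram(selected, bins=bins, range=value_range, density=True)
    return pdf, edges

# Synthetic stand-in data (NOT from the paper): a label-dependent shift of the mean.
rng = np.random.default_rng(1)
labels = rng.choice([-1, 0, 1], size=30000, p=[0.1, 0.8, 0.1])
values = rng.normal(loc=0.3 * labels, scale=0.5)

common = (-3.0, 3.0)   # shared range so that bins of different labels coincide
pdf_trivial, edges = marginal_pdf(values, labels, 0, value_range=common)
pdf_c1, _ = marginal_pdf(values, labels, 1, value_range=common)
```

Using the same bin edges for every label is a deliberate choice here: it lets the marginals of different phases be compared bin by bin.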

We note that features are in general complex-valued. To gain the most insight, one can examine the marginal PDFs for the real part (\({{\,\textrm{Re}\,}}[x_i]\)), imaginary part (\({{\,\textrm{Im}\,}}[x_i]\)), modulus \((|x_i|)\) and phase (\(\varphi [x_i] \equiv \arg [x_i]\)) of each feature \(x_i\). The contrast between marginal PDFs for topological phases, i.e., \(p_{C\ne 0}\), and marginal PDFs for trivial phases, i.e., \(p_{0}\), indicates by which features a particular topological phase is characterized. In this regard, a quantitative measure of the importance of a certain feature \(x_i\) for the topological phase with index *C* is given by the Bhattacharyya distance [56] between the topological and trivial PDFs, namely

\[
D_B(p_{C\ne 0}, p_{0}) = -\ln \int \sqrt{p_{C\ne 0}(x_i)\, p_{0}(x_i)} \; \textrm{d}x_i . \tag{3}
\]

This quantity, referred to as *importance score* in the following, allows us to perform a dimensional reduction of the feature space, by omitting the features with the lowest values of \(D_B(p_{C\ne 0}, p_{0})\) in the course of the statistical analysis. It is worth mentioning that although other statistical distances between probability distributions can be adopted for the definition of the importance score (e.g., the Hellinger distance [57]), a previous benchmark study on the honeycomb lattice has shown that the Bhattacharyya distance provides a better contrast of the marginal distributions with respect to other metrics [51, 52]. Complementary to the use of the importance score, a model can be refined by establishing symmetries between features, as obtained either from physical grounds or from the behavior of the marginal PDFs [13]. An iterative application of the statistical method, involving subsequent data generation, dimensionality reduction and analysis of the marginal PDFs, leads to the definition of effective models for topological phases.
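The importance score can be evaluated directly from two binned marginals. The sketch below assumes that both PDFs are tabulated on the same uniform grid with spacing `dx`; the function names are ours. It uses the standard definitions of the Bhattacharyya and Hellinger distances, both of which are built from the same overlap integral.

```python
import numpy as np

def bhattacharyya_distance(p, q, dx):
    """-ln of the overlap integral of sqrt(p*q); p and q share a uniform grid."""
    overlap = np.sum(np.sqrt(p * q)) * dx
    return np.inf if overlap <= 0.0 else -np.log(overlap)

def hellinger_distance(p, q, dx):
    """sqrt(1 - overlap); bounded between 0 and 1."""
    overlap = np.sum(np.sqrt(p * q)) * dx
    return np.sqrt(max(0.0, 1.0 - overlap))

# Check against a known case: two unit-variance Gaussians whose means differ by 1
# have the analytic Bhattacharyya distance 1/8.
x = np.arange(-10.0, 11.0, 0.01)
gauss = lambda mu: np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)
d_b = bhattacharyya_distance(gauss(0.0), gauss(1.0), dx=0.01)
d_h = hellinger_distance(gauss(0.0), gauss(1.0), dx=0.01)
```

Two marginals without any overlap give an infinite Bhattacharyya distance, whereas the Hellinger distance saturates at 1.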

Furthermore, in the present work we pursue a better understanding of the parametrization of topological phases by introducing a *feature engineering* procedure. We define additional composite features by taking certain combinations, e.g., sums, products or power series, of (some of) the original features and compute their corresponding importance score as the Bhattacharyya distance between the trivial and topological marginal PDFs. Some of these engineered features may carry a higher importance score than the original ones and thus serve as particularly good descriptors of a given phase.

Employing the statistical method outlined in this section, complemented by feature engineering, we tackle the study of topological states that can manifest in the square–octagon lattice.

## 3 Lattice and model

The square–octagon lattice, sketched in Fig. 1a, is defined by a square Bravais lattice and a unit cell of four lattice sites. Denoting the Bravais lattice vectors by \(\textbf{a}_1= (1,0)\) and \(\textbf{a}_2= (0,1)\), the four sites inside the unit cell can be placed at positions \(\pm {\sqrt{2}}/{2}\ \textbf{a}_1\) and \(\pm {\sqrt{2}}/{2} \ \textbf{a}_2\). To investigate possible topological phases on this lattice we consider a spinless tight-binding Hamiltonian of the form of Eq. (1), with hopping terms restricted to first- through fourth-neighbor sites. Assuming translational invariance, the model contains four onsite potentials, with parameters \(\epsilon _{1\le s\le 4}\), and a total of 28 hopping parameters. As shown by the different colored lines in Fig. 1a, the 28 hoppings are divided into six first-neighbor terms \(t^1_{1\le s\le 6}\) (green lines), two second-neighbor terms \(t^2_{s=1,2}\) (blue lines), eight third-neighbor terms \(t^3_{1\le s \le 8}\) (orange lines), and twelve fourth-neighbor terms \(t^4_{1\le s\le 12}\) (dashed red lines; for the sake of clarity, only three symmetry-inequivalent links are shown).
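The bond counts quoted above can be cross-checked by enumerating distance shells. The sketch below assumes regular squares and octagons with unit first-neighbor bond length, which fixes the lattice constant to \(1+\sqrt{2}\) and places the four sites at \((\pm \sqrt{2}/2, 0)\) and \((0, \pm \sqrt{2}/2)\); this normalization matches the distances \(R_n\) used for the reference hoppings in Sect. 4. All names in the code are ours.

```python
import numpy as np
from collections import Counter

A = 1 + np.sqrt(2)        # lattice constant for unit bond length (assumption)
D = np.sqrt(2) / 2
SITES = np.array([[D, 0.0], [0.0, D], [-D, 0.0], [0.0, -D]])

def bond_shells(n_shells=4, reach=2):
    """Return (distance, bonds per unit cell) for the first n_shells shells."""
    counts = Counter()
    for ri in SITES:                                  # site in the home cell
        for rj in SITES:                              # partner site ...
            for nx in range(-reach, reach + 1):       # ... in a nearby cell
                for ny in range(-reach, reach + 1):
                    d = np.linalg.norm(rj + A * np.array([nx, ny]) - ri)
                    if d > 1e-9:                      # skip the site itself
                        counts[round(d, 6)] += 1
    shells = sorted(counts)[:n_shells]
    # every bond is seen from both of its end points, hence the division by 2
    return [(r, counts[r] // 2) for r in shells]

shells = bond_shells()
```

With this geometry the first four shells sit at distances \(1\), \(\sqrt{2}\), \(\sqrt{2+\sqrt{2}}\) and \(1+\sqrt{2}\) and contain 6, 2, 8 and 12 bonds per unit cell, i.e., 28 independent hoppings in total.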

The band structure for the model with uniform onsite terms, \({\epsilon _{1\le s\le 4} = 1}\), and uniform first-neighbor hoppings, \({t^1_{1\le s\le 6} = -1}\) (\({t^{n>1}_s=0}\)), is shown in Fig. 1b. The system is metallic for any filling. The dashed horizontal line indicates the Fermi energy at 1/4 filling, where the dispersion is characterized by a triply degenerate point at *M*, formed by the lowest-lying three bands and consisting of a Dirac node and a (partially) flat band. Another band crossing of the same type, formed by the upper three bands, occurs at the \(\Gamma\) point.
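The two triple degeneracies can be reproduced with a few lines of linear algebra. The sketch below builds the \(4\times 4\) Bloch matrix of the uniform first-neighbor model; the site ordering (right, top, left, bottom of a square plaquette), the choice of gauge and the function name are our assumptions, and momenta are measured in units of the inverse lattice constant.

```python
import numpy as np

def bloch_hamiltonian(kx, ky, eps=1.0, t=-1.0):
    """First-neighbor Bloch matrix; sites ordered (right, top, left, bottom)."""
    h = np.zeros((4, 4), dtype=complex)
    for a, b in [(0, 1), (1, 2), (2, 3), (3, 0)]:   # bonds inside the square
        h[a, b] = t
    h[0, 2] = t * np.exp(1j * kx)   # right site to left site of the next cell along a1
    h[1, 3] = t * np.exp(1j * ky)   # top site to bottom site of the next cell along a2
    h = h + h.conj().T              # add the Hermitian-conjugate hoppings (t is real)
    return h + eps * np.eye(4)

bands_M = np.linalg.eigvalsh(bloch_hamiltonian(np.pi, np.pi))   # M point
bands_G = np.linalg.eigvalsh(bloch_hamiltonian(0.0, 0.0))       # Gamma point
```

For \(\epsilon = 1\) and \(t = -1\) the eigenvalues are \((0, 0, 0, 4)\) at *M* and \((-2, 2, 2, 2)\) at \(\Gamma\), so the lowest three bands meet at *M* and the upper three bands meet at \(\Gamma\).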

By tuning the hopping parameters it is possible to create topological insulators, i.e., open a gap in the energy bands and induce nonzero Chern numbers. In the following, we will infer which parameters have to be manipulated in order to create topological insulators. We focus on the case of 1/4 filling, for which the Fermi energy intersects the lower triply degenerate band crossing. The presence of partially flat bands gives rise to high-order van Hove singularities in the density of states [5], which implies a strong susceptibility of the system toward symmetry breaking in the presence of electron–electron interactions [3, 4]. The Fermi surface of the system coincides with the edges of the Brillouin zone, i.e., it can be seen as the square connecting the *M* points. Along its vertical (horizontal) edge, i.e., \(\textbf{k}=(\pi ,k)\) [\(\textbf{k}=(k, \pi )\)], the Fermi surface displays a mixed sublattice character, with the Bloch waves being evenly localized on sublattice sites 1 and 3 (2 and 4).

## 4 Statistical analysis

For the statistical analysis of the square–octagon lattice we begin by considering the tight-binding model of Eq. (1) with hopping terms up to fourth-neighbor bonds. In order to randomly sample the feature space, we define a set of reference values for each feature, generally denoted as \(x_i^{\rm ref}\) [51]. The samples are drawn according to a multivariate (two-dimensional) Gaussian distribution in the complex plane, centered at \(x_i^{\rm ref}\in {\mathbb {C}}\) and with covariance matrix \({\Sigma =\alpha ^2|x_i^{\rm ref}|^2 \mathbb {1}_{2\times 2}}\), where \(\alpha \in {\mathbb {R}}\) is an arbitrary hyperparameter. Analogously, for real-valued features, i.e., onsite potentials, a one-dimensional Gaussian is employed. As reference points for the various features, we take \({\epsilon _{1 \le s \le 4}^{\rm ref} = 0.25}\), \({t^{1, {\rm ref}}_{1\le s\le 6} = -1}\), \({t^{2, {\rm ref}}_{s=1,2} = {-1}/{\sqrt{2}}}\), \({t^{3, {\rm ref}}_{1\le s\le 8} = {-1}/{\sqrt{2+\sqrt{2}}}}\) and \({t^{4, {\rm ref}}_{1\le s\le 12} = {-1}/{(1+\sqrt{2})}}\). Note that the reference points of the hopping terms are scaled by the inverse distance between *n*th neighbors, i.e., \(1/R_n\). For the width of the Gaussian PDFs, we take \(\alpha =0.6\). This scheme allows us to consider physical Hamiltonians where extreme values of tight-binding parameters are excluded [51]. For example, within our parametrization, the choice \(\alpha =0.6\) ensures that the real part of the extracted features does not change sign with respect to the reference point for most samples (\(\approx 95\%\)). We verified that small changes of \(\alpha\) with respect to the above choice do not affect the results significantly. However, in general, extreme values of \(\alpha\) should be avoided. Indeed, if \(\alpha\) is too small the sampling is limited to Hamiltonians which are close to the reference point and does not cover a significant amount of the feature space; on the other hand, for a fixed number of samples, choosing a larger value of \(\alpha\) leads to noisier marginal PDFs, which may hamper the statistical analysis.

We note that the choice of the reference point constitutes the main bias of the present approach. The simplest way to alleviate this bias involves choosing different initial reference points to cover a larger portion of the feature space. The choice can be based on an iterative application of the statistical method: once a set of parameters yielding a certain topological phase is identified, one can perform a new statistical analysis centered around the topological reference point, thus exploring the feature space around it. On the other hand, biasing the results around a certain reference point can be desirable when the present method is applied to a specific physical system. For example, if one is interested in exploring topological phases for a certain target material, the reference point can be chosen to be an *ab initio* determined tight-binding Hamiltonian [52].
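The sampling scheme can be sketched as follows. The reference values and \(\alpha = 0.6\) are taken from the text, with twelve fourth-neighbor terms as counted in Sect. 3; the dictionary layout, the function name and the use of NumPy's random generator are our assumptions. An isotropic two-dimensional Gaussian with covariance \(\alpha ^2|x^{\rm ref}|^2 \mathbb {1}\) amounts to independent normal draws for the real and imaginary parts.

```python
import numpy as np

ALPHA = 0.6
# name: (reference value, number of independent terms)
REFERENCE = {
    "eps": (0.25, 4),
    "t1": (-1.0, 6),
    "t2": (-1.0 / np.sqrt(2.0), 2),
    "t3": (-1.0 / np.sqrt(2.0 + np.sqrt(2.0)), 8),
    "t4": (-1.0 / (1.0 + np.sqrt(2.0)), 12),
}

def draw_samples(n_samples, rng, alpha=ALPHA):
    """Draw random features around the reference point."""
    out = {}
    for name, (ref, count) in REFERENCE.items():
        sigma = alpha * abs(ref)
        real = rng.normal(ref, sigma, size=(n_samples, count))
        if name == "eps":
            out[name] = real                      # onsite terms are real-valued
        else:
            imag = rng.normal(0.0, sigma, size=(n_samples, count))
            out[name] = real + 1j * imag          # complex hoppings
    return out

samples = draw_samples(20000, np.random.default_rng(0))
same_sign = np.mean(samples["t1"].real < 0)       # fraction keeping the reference sign
```

The real part changes sign only if it deviates from the reference by more than \(1/\alpha \approx 1.67\) standard deviations, which happens for roughly 5% of the draws, in line with the \(\approx 95\%\) quoted above.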

After creating a data set of \(n_S = 2 \cdot 10^7\) samples on the square–octagon lattice, we find 3.3% insulators out of all samples, 17.6% of which are topological. Nearly all topological insulators (99.6%) have Chern index \(C=\pm 1\). As we sample in a large parameter space, the number of topologically non-trivial samples is small. Hence, we proceed by attempting a dimensional reduction in order to infer more information on the topological phases.

### 4.1 Dimensional reduction

The parameter space can be reduced by examining the importance scores \(D_B(p_{1}(x_i),p_0(x_i))\) and \(D_B(p_{-1}(x_i),p_0(x_i))\) for the \(C=\pm 1\) phases, which constitute the majority of topological samples. The importance score of each feature is the same for \(C=1\) and \(C=-1\), because the underlying marginal PDFs for opposite Chern numbers are mirror images of each other with respect to the \({{\,\textrm{Im}\,}}(x_i)=0\) axis in the complex plane. As shown in Fig. 2, we observe zero importance for onsite terms. Hence, the parameters \(\epsilon _i\) do not play any role in distinguishing trivial and topological phases, and can thus be excluded from the statistical analysis. On the other hand, the importance score of all hopping parameters is finite.

Importance scores that coincide up to statistical noise (due to the finite number of topological insulators in the data set) indicate the presence of subgroups of hoppings, as expected from the inherent symmetry of the square–octagon lattice. Indeed, we can distinguish two classes of first-neighbor hoppings, according to their importance score: (i) bonds within square plaquettes \(\{t^1_1, t^1_2, t^1_3, t^1_4\}\) and (ii) bonds connecting square plaquettes \(\{t^1_5, t^1_6\}\). Fourth-neighbor hoppings can likewise be grouped into three classes: (i) bonds crossing the square plaquettes \(\{t^4_1, t^4_2, t^4_3, t^4_4\}\) (vertical red dashed line in Fig. 1a), (ii) bonds crossing the octagonal plaquettes and connecting sites belonging to the same sublattice \(\{t^4_5, t^4_6, t^4_7, t^4_8\}\) (horizontal red dashed line in Fig. 1a) and (iii) bonds crossing the octagonal plaquettes and connecting sites belonging to different sublattices \(\{t^4_9, t^4_{10}, t^4_{11}, t^4_{12}\}\) (diagonal red dashed line in Fig. 1a). As for second-neighbor hoppings (\(t^2_{s=1,2}\)) and third-neighbor hoppings (\(t^3_{1\le s \le 8}\)), no distinction into subgroups can be made based on the importance score.

Among all hoppings, the lowest importance is shown by the fourth-neighbor hoppings which connect sites belonging to the same sublattice, i.e., classes (i) and (ii). Therefore, we omit these parameters (and the onsite potentials) in the next iteration of our analysis. A new sampling procedure with the reduced model yields a remarkably larger number of insulators (64% of all samples) and shows a rather low importance for the second-neighbor hopping terms, which is approximately three times lower than the importance of third- and fourth-neighbor terms (not shown). Hence, we also exclude the second-neighbor hoppings from our model, in order to scale down the size of the feature space. This will enhance the contrast between the marginal PDFs and thus simplify the subsequent analysis.
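One reduction step of this kind can be written as a simple thresholding of the importance scores. The text does not state a fixed cutoff rule, so the criterion below (drop every feature whose score falls below a chosen fraction of the largest score) is our assumption, as are the function name and the numbers, which are purely illustrative and are not the scores reported in the figures.

```python
def reduce_features(scores, ratio):
    """Split features into (kept, dropped) using the cutoff ratio * max(score)."""
    threshold = ratio * max(scores.values())
    kept = sorted((name for name, s in scores.items() if s >= threshold),
                  key=scores.get, reverse=True)       # most important first
    dropped = [name for name, s in scores.items() if s < threshold]
    return kept, dropped

# Illustrative numbers only; they are NOT taken from the paper.
example_scores = {"eps": 0.0, "t1": 0.050, "t2": 0.006, "t3": 0.020, "t4_iii": 0.018}
kept, dropped = reduce_features(example_scores, ratio=1 / 3)
```

After each such step a new data set is generated with the reduced model, since the scores of the remaining features generally change once the discarded parameters are no longer sampled.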

We are thus left with a model including only first-, third-, and fourth-neighbor hoppings of class (iii) (i.e., those connecting sites belonging to different sublattices). Note that, for simplicity, we will refer to the latter as “fourth-neighbor hoppings” in the remainder of the paper. The new data set contains 64% insulators, 18.2% of which possess a non-trivial Chern index \(C=1\) or \(C=-1\). The fraction of insulators with higher Chern number is negligibly small. Compared to the previous iterations, we observe a higher portion of topological insulators due to the reduced parameter space (11.6% out of all samples, against 0.57% for the full model including onsite terms and all hoppings up to fourth-neighbors).
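As a quick consistency check of these fractions, using only the rounded percentages quoted in the text, the share of topological samples among all samples is the product of the insulating fraction and the topological fraction among insulators:

\[
0.64 \times 0.182 \approx 0.116 , \qquad 0.033 \times 0.176 \approx 0.0058 .
\]

The first product reproduces the 11.6% of the reduced model, and the second agrees with the 0.57% of the full model up to the rounding of the input percentages.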

We can gather information on topological phases from this model by considering the marginal probability distributions. Based on their appearance, the PDFs of first-neighbor hoppings can be grouped in two subsets, one formed by the hoppings inside the square plaquettes {\(t^1_{1}\), \(t^1_{2}\), \(t^1_{3}\), \(t^1_{4}\)}, and the other containing hoppings that connect adjacent plaquettes {\(t^1_{5}\), \(t^1_{6}\)}, see Fig. 1a. The marginal distributions for third- and fourth-neighbors, respectively, show the same behavior among each type. One exemplary set of the PDFs of imaginary parts \(p_C({{\,\textrm{Im}\,}}[t^n_s])\), which indicate the “directions” of the complex hoppings, and PDFs of the moduli \(p_C(|t^n_s|)\), which describe the overall hopping strengths, is shown in Fig. 3 for each group of hoppings. From these PDFs we can infer the most descriptive features characterizing trivial and topological phases, as discussed in the following.

#### 4.1.1 Trivial \(C=0\) phase

In the trivial phase, the marginal PDFs for the imaginary parts of all hoppings shown in Fig. 3, i.e., \(p_0({{\,\textrm{Im}\,}}[t^1_1])\), \(p_0({{\,\textrm{Im}\,}}[t^1_5])\), \(p_0({{\,\textrm{Im}\,}}[t_1^3])\) and \(p_0({{\,\textrm{Im}\,}}[t_5^4])\), are perfectly symmetric around zero. We can thus infer that no specific hopping direction is preferred. The PDFs of the moduli \(p_0(|t^1_1|)\), \(p_0(|t^1_5|)\), \(p_0(|t_1^3|)\) and \(p_0(|t_5^4|)\) show similar shapes, with a nonzero mean indicating finite bond strengths. Hence, the trivial insulating phase can be realized by finite first-, third- and/or fourth-neighbor hoppings, with no specific hopping direction (e.g., by real hoppings). This configuration is schematically illustrated in the left panel of Fig. 4, where we color the relevant bonds within one unit cell.

#### 4.1.2 Topological \(C=\pm 1\) phases

For first-neighbor hoppings within the square plaquettes, exemplified by the term \(t^1_{1}\) in Fig. 3, we observe that the marginal PDFs for nonzero Chern numbers, i.e., \(p_{\pm 1}({{\,\textrm{Im}\,}}[t^1_1])\) and \(p_{\pm 1}(|t^1_1|)\), look rather distinct from the PDFs of the trivial phase. For the \(C=1\) phase, \({{\,\textrm{Im}\,}}[t^1_1]\) tends to be larger than zero, which corresponds to an anticlockwise winding of the hoppings around the square plaquettes. The distribution \(p_{-1}({{\,\textrm{Im}\,}}[t^1_1])\) is the mirror image of \(p_{1}({{\,\textrm{Im}\,}}[t^1_1])\) about zero, hence the winding is clockwise for \(C=-1\). The modulus \(|t_1^1|\), i.e., the overall hopping strength, shows larger values for topological phases than for the trivial phase. As for the remaining first-neighbor hoppings, represented by the term \(t^1_{5}\) in Fig. 3, we observe that \(p_{\pm 1}({{\,\textrm{Im}\,}}[t^1_5])\) is symmetric around zero, i.e., no particular hopping direction is indicated and, thus, these hoppings do not play a role in differentiating between \(C =\pm 1\) and \(C=0\) phases. At variance with the case of \(|t_1^1|\), the marginal PDFs of \(|t_5^1|\) have similar means for \(C=\pm 1\) and \(C=0\) phases.

As shown in Fig. 3 by the representative term \(t^3_{1}\), also the marginal PDFs for the imaginary part of third-neighbor hoppings behave differently for topological and trivial phases: \({{\,\textrm{Im}\,}}[t^3_1]\) tends to be larger than zero for \(C=1\), while for \(C=-1\) it shows a tendency to be smaller than zero. This implies that nonzero third-neighbor bonds with complex hoppings winding clockwise (anticlockwise) in the octagonal plaquettes can support the non-trivial \(C=1\) (\(C=-1\)) phase. On the contrary, the marginal PDFs of the moduli \(p_C(|t^3_1|)\) look identical for \(C=\pm 1\) and \(C=0\) and, thus, they provide no information about the topological properties. Finally, we observe that the marginals of fourth-neighbor hoppings, represented by \(t_5^4\), show qualitatively the same behavior as the marginals for the third-neighbor bonds. Hence, also nonzero fourth-neighbor bonds with hopping directions winding clockwise (anticlockwise) can support the non-trivial \(C=1\) (\(C=-1\)) phase.

In summary, a topologically insulating phase with \(C=1\) can be induced by anticlockwise first-neighbor hoppings on the square plaquettes, which are relatively stronger than the bonds connecting square plaquettes, together with clockwise third- and fourth-neighbor hoppings. Topological insulators with \({C=-1}\) can be created by reversing the hopping directions of the \(C=1\) phase. These results are schematically summarized by the sketches in the middle and right panels of Fig. 4. Here, the thickness of the bonds reflects the relative hopping strengths and the arrows illustrate the hopping directions, i.e., the sign of their imaginary parts.

### 4.2 Toward a first-neighbor model and feature engineering

To gain a deeper understanding of the phases which can manifest in the square–octagon lattice, we continue with a reduction of parameters based on the importance scores. Within the tight-binding model with first-, third- and fourth-neighbor hoppings discussed in the previous section, the importance score of first-neighbor terms turns out to be up to eight times larger than the importance score of third- and fourth-neighbor terms. Based on these observations we exclude third- and fourth-neighbor terms as the next step of our analysis.

Creating a data set for the Hamiltonian with only first-neighbor hoppings yields 95.5% insulating samples. The fraction of topological insulators corresponds to 13.7% of all samples, analogously to what has been observed in the calculation with first-, third- and fourth-neighbor hoppings. This further indicates the higher importance of first-neighbor hoppings for the topologically non-trivial phases with \({C=\pm 1}\). With the first-neighbor model we arrive at the minimal possible description for topological phases on the square–octagon lattice. As done previously, we can group first-neighbor hoppings into subsets based on the behavior of marginal PDFs (not shown): (i) hoppings that form square plaquettes, \(\{t^1_1, t^1_2, t^1_3, t^1_4\}\), and (ii) hoppings connecting different squares \(\{t^1_5, t^1_6\}\).

To gain additional information on the topological phases, we apply feature engineering, namely we define new composite features by taking all possible products involving (distinct) first-neighbor hoppings, i.e., pair-wise products of the form \(t_s^1 t_{s'}^1\), triple products of the form \(t_s^1 t_{s'}^1 t_{s''}^1\), and so on, up to the product of all six first-neighbor hoppings. We then calculate the importance score for the newly engineered features and identify the ones which play a major role in characterizing the topological phases.
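The construction of these product features is a plain enumeration of subsets. A minimal sketch is given below, assuming the sampled first-neighbor hoppings are stored as a complex array with one column per hopping; the function name and the labels such as `t1_1` are hypothetical.

```python
from itertools import combinations
import numpy as np

def product_features(hoppings, names):
    """All products of 2, 3, ..., N distinct features.

    hoppings : (n_samples, N) complex array, one column per hopping
    names    : list of N labels used to build the keys of the result
    """
    engineered = {}
    for order in range(2, len(names) + 1):
        for idx in combinations(range(len(names)), order):
            key = "*".join(names[i] for i in idx)
            engineered[key] = np.prod(hoppings[:, list(idx)], axis=1)
    return engineered

# Hypothetical usage with random numbers standing in for sampled hoppings.
rng = np.random.default_rng(0)
t1 = rng.normal(size=(5, 6)) + 1j * rng.normal(size=(5, 6))
features = product_features(t1, [f"t1_{s}" for s in range(1, 7)])
```

For six hoppings this yields \(2^6 - 1 - 6 = 57\) engineered features, each of which can then be passed through the same marginal-PDF and importance-score analysis as the original ones.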

As shown in Fig. 5, the product of all hoppings on the square plaquettes, namely \(t^1_1 t^1_2 t^1_3 t^1_4\), turns out to possess a remarkably large importance score, \(D_B = 0.45\) (cf. \(D_B \approx 0.05\) for \(t^1_{1\le s \le 6}\)). For this particular engineered feature, the marginal PDFs in the complex plane, shown in Fig. 6, provide crucial insight. In the \(C=0\) phase, the PDF of \(t^1_1 t^1_2 t^1_3 t^1_4\) is symmetric with respect to the real axis, as shown in Fig. 6a. On the other hand, the marginals for the topological phases (Fig. 6b, c) are completely localized in the upper and lower half of the complex plane for \(C=-1\) and \(C=1\), respectively. Since the two distributions do not overlap, the importance score for distinguishing the two topological phases takes its maximal value, i.e., \(D_B(p_1(t^1_1 t^1_2 t^1_3 t^1_4), p_{-1}(t^1_1 t^1_2 t^1_3 t^1_4)) = \infty\). This implies that the distinct topological phases are unambiguously distinguished by this engineered feature. Physically, the topological phases are distinguished by the phase picked up after one loop around the square plaquette, which is given by \(\varphi [t^1_1 t^1_2 t^1_3 t^1_4] = \varphi [t^1_1] + \varphi [t^1_2] + \varphi [t^1_3] + \varphi [t^1_4]\) (modulo \(2\pi\)). Ultimately, the engineered feature \(t^1_1 t^1_2 t^1_3 t^1_4\) may serve as the unique descriptor of the topological phases.
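The relation between the plaquette product and the individual hopping phases can be checked numerically. In the sketch below the function names and the sample values are ours. It assumes hoppings whose real part stays close to the negative reference value \(-1\): a positive imaginary part on every bond then gives individual phases slightly below \(\pi\), and their fourfold sum folds back to a negative angle.

```python
import numpy as np

def plaquette_phase(t_square):
    """Phase of the product of the four plaquette hoppings, folded into (-pi, pi]."""
    return float(np.angle(np.prod(t_square)))

def summed_phase(t_square):
    """Sum of the four individual phases, folded into the same interval."""
    return float(np.angle(np.exp(1j * np.sum(np.angle(t_square)))))

t_pos = np.full(4, -1.0 + 0.2j)   # hypothetical sample with Im[t] > 0 on every bond
t_neg = t_pos.conj()              # mirror sample with Im[t] < 0 on every bond

phase_pos = plaquette_phase(t_pos)   # negative: product lies in the lower half-plane
phase_neg = plaquette_phase(t_neg)   # positive: product lies in the upper half-plane
```

Under the stated assumption, this is consistent with the observation that samples with \({{\,\textrm{Im}\,}}[t^1_s] > 0\) on the square plaquette, typical of \(C=1\), populate the lower half of the complex plane for the product feature, and vice versa for \(C=-1\).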

## 5 Summary

The statistical method introduced in Ref. [51] constitutes an effective procedure to identify possible topological phases that can be realized by a tight-binding Hamiltonian on a given lattice structure. We employed this technique to scrutinize the topological phases appearing at the high-order van Hove filling on the square–octagon lattice, which forms one of the eleven *Archimedean* tessellations of the two-dimensional Euclidean plane and is realized in a number of different materials. Starting from a generic tight-binding model with hoppings up to fourth nearest neighbors, we constructed a data set of randomized Hamiltonians labeled by their Chern number as topological index. We then performed a statistical analysis of the marginal probability distributions for the various parameters of the system and, by means of dimensional reduction, we reached an effective model describing topological phases with Chern number \(C=\pm 1\) on the square–octagon lattice. Most importantly, we introduced a *feature engineering* procedure that allows us to gain deeper insight into the nature of the topological phases by identifying polynomial combinations of tight-binding parameters which are associated with non-trivial topology, e.g., Peierls-like fluxes. Going beyond the methodological improvements presented in this work and the results for the square–octagon lattice, the present statistical method can be regarded as a potential tool to perform a material-specific search for topological phases, by exploring the phase space around a tight-binding Hamiltonian obtained from first principles (e.g., by density-functional theory and Wannierization). The search for topological phases and their characterization by means of engineered features could serve as a guide to experimental manipulation of the target material to tune its properties toward desired topological phases, e.g., by means of applied pressure or strain.

## Data Availability Statement

This manuscript has associated data in a data repository. [Authors' comment: The data presented in this manuscript are available from the corresponding authors upon reasonable request].

## References

1. D. Chavey, Tilings by regular polygons–II: a catalog of tilings. Comput. Math. Appl. **17**(1), 147–165 (1989). https://doi.org/10.1016/0898-1221(89)90156-9
2. In other words, each vertex of these networks is surrounded by the same set of regular polygons.
3. N.F.Q. Yuan, H. Isobe, L. Fu, Magic of high-order van Hove singularity. Nat. Commun. **10**(1), 5769 (2019). https://doi.org/10.1038/s41467-019-13670-9
4. Y. Yamashita, M. Tomura, Y. Yanagi, K. Ueda, SU(3) Dirac electrons in the \(\frac{1}{5}\)-depleted square-lattice Hubbard model at \(\frac{1}{4}\) filling. Phys. Rev. B **88**, 195104 (2013). https://doi.org/10.1103/PhysRevB.88.195104
5. D.O. Oriekhov, V.P. Gusynin, V.M. Loktev, Orbital susceptibility of T-graphene: interplay of high-order van Hove singularities and Dirac cones. Phys. Rev. B **103**, 195104 (2021). https://doi.org/10.1103/PhysRevB.103.195104
6. M.L. Kiesel, R. Thomale, Sublattice interference in the kagome Hubbard model. Phys. Rev. B **86**, 121105 (2012). https://doi.org/10.1103/PhysRevB.86.121105
7. B.R. Ortiz, L.C. Gomes, J.R. Morey, M. Winiarski, M. Bordelon, J.S. Mangum, I.W.H. Oswald, J.A. Rodriguez-Rivera, J.R. Neilson, S.D. Wilson, E. Ertekin, T.M. McQueen, E.S. Toberer, New kagome prototype materials: discovery of KV\(_3\)Sb\(_5\), RbV\(_3\)Sb\(_5\), and CsV\(_3\)Sb\(_5\). Phys. Rev. Mater. **3**, 094407 (2019). https://doi.org/10.1103/PhysRevMaterials.3.094407
8. T. Neupert, M. Michael Denner, J.-X. Yin, R. Thomale, M. Zahid Hasan, Charge order and superconductivity in kagome materials. Nat. Phys. **18**(2), 137–143 (2022). https://doi.org/10.1038/s41567-021-01404-y
9. M.M. Denner, R. Thomale, T. Neupert, Analysis of charge order in the kagome metal AV\(_3\)Sb\(_5\) (A = K, Rb, Cs). Phys. Rev. Lett. **127**, 217601 (2021). https://doi.org/10.1103/PhysRevLett.127.217601
10. T. Park, M. Ye, L. Balents, Electronic instabilities of kagome metals: saddle points and Landau theory. Phys. Rev. B **104**, 035142 (2021). https://doi.org/10.1103/PhysRevB.104.035142
11. Y.-P. Lin, R.M. Nandkishore, Complex charge density waves at van Hove singularity on hexagonal lattices: Haldane-model phase diagram and potential realization in the kagome metals AV\(_3\)Sb\(_5\) (A = K, Rb, Cs). Phys. Rev. B **104**, 045122 (2021). https://doi.org/10.1103/PhysRevB.104.045122
12. X. Feng, K. Jiang, Z. Wang, J. Hu, Chiral flux phase in the Kagome superconductor AV\(_3\)Sb\(_5\). Sci. Bull. **66**, 1384–1388 (2021). https://doi.org/10.1016/j.scib.2021.04.043
13. T. Mertz, P. Wunderlich, S. Bhattacharyya, F. Ferrari, R. Valentí, Statistical learning of engineered topological phases in the kagome superlattice of AV\(_3\)Sb\(_5\). npj Comput. Mater. **8**(1), 1–6 (2022)
14. Y. Hu, X. Wu, B.R. Ortiz, S. Ju, X. Han, J. Ma, N.C. Plumb, M. Radovic, R. Thomale, S.D. Wilson, A.P. Schnyder, M. Shi, Rich nature of van Hove singularities in kagome superconductor CsV\(_3\)Sb\(_5\). Nat. Commun. **13**(1), 2220 (2022). https://doi.org/10.1038/s41467-022-29828-x
15. Yu. Zhang, J. Lee, W.-L. Wang, D.-X. Yao, Two-dimensional octagon-structure monolayer of nitrogen group elements and the related nano-structures. Comput. Mater. Sci. **110**, 109–114 (2015)
16. P. Vijay Gaikwad, A. Kshirsagar, Octagonal family of monolayers, bulk and nanotubes. arXiv:2003.00158 (2020)
17. W. Li, M. Guo, G. Zhang, Y.-W. Zhang, Gapless MoS\(_2\) allotrope possessing both massless Dirac and heavy fermions. Phys. Rev. B **89**, 205402 (2014). https://doi.org/10.1103/PhysRevB.89.205402
18. M.A. Springer, T.-J. Liu, A. Kuc, T. Heine, Topological two-dimensional polymers. Chem. Soc. Rev. **49**, 2007–2019 (2020). https://doi.org/10.1039/C9CS00893D
19. T.-J. Liu, M.A. Springer, N. Heinsdorf, A. Kuc, R. Valentí, T. Heine, Semimetallic square–octagon two-dimensional polymer with high mobility. Phys. Rev. B **104**, 205419 (2021). https://doi.org/10.1103/PhysRevB.104.205419
20. A.N. Enyashin, A.L. Ivanovskii, Graphene allotropes. Physica Status Solidi (B) **248**(8), 1879–1883 (2011). https://doi.org/10.1002/pssb.201046583
21. Yu. Liu, G. Wang, Q. Huang, L. Guo, X. Chen, Structural and electronic properties of T graphene: a two-dimensional carbon allotrope with tetrarings. Phys. Rev. Lett. **108**, 225505 (2012). https://doi.org/10.1103/PhysRevLett.108.225505
22. X.-L. Sheng, H.-J. Cui, F. Ye, Q.-B. Yan, Q.-R. Zheng, S. Gang, Octagraphene as a versatile carbon atomic sheet for novel nanotubes, unconventional fullerenes, and hydrogen storage. J. Appl. Phys. **112**(7), 074315 (2012)
23. A.I. Podlivaev, L.A. Openov, Kinetic stability of octagraphene. Phys. Solid State **55**(12), 2592–2595 (2013)
24. G. Qinyan, D. Xing, J. Sun, Superconducting single-layer T-graphene and novel synthesis routes. Chin. Phys. Lett. **36**(9), 097401 (2019)
25. J. Luo, H.T. Zhu, F. Zhang, J.K. Liang, G.H. Rao, J.B. Li, Z.M. Du, Spin-glasslike behavior of K\(^+\)-containing \(\alpha\)-MnO\(_2\) nanotubes. J. Appl. Phys. **105**(9), 093925 (2009). https://doi.org/10.1063/1.3117495
26. Y. Crespo, N. Seriani, Electronic and magnetic properties of \(\alpha\)-MnO\(_2\) from ab initio calculations. Phys. Rev. B **88**, 144428 (2013). https://doi.org/10.1103/PhysRevB.88.144428
27. Y. Crespo, A. Andreanov, N. Seriani, Competing antiferromagnetic and spin-glass phases in a hollandite structure. Phys. Rev. B **88**, 014202 (2013). https://doi.org/10.1103/PhysRevB.88.014202
28. S. Mandal, A. Andreanov, Y. Crespo, N. Seriani, Incommensurate helical spin ground states on the hollandite lattice. Phys. Rev. B **90**, 104420 (2014). https://doi.org/10.1103/PhysRevB.90.104420
29. S. Liu, A.R. Akbashev, X. Yang, X. Liu, W. Li, L. Zhao, X. Li, A. Couzis, M.-G. Han, Y. Zhu, L. Krusin-Elbaum, J. Li, L. Huang, S.J.L. Billinge, J.E. Spanier, S. O’Brien, Hollandites as a new class of multiferroics. Sci. Rep. **4**(1), 6203 (2014). https://doi.org/10.1038/srep06203
30. A. Maity, S. Mandal, Quantum theory of spin waves for helical ground states in a hollandite lattice. J. Phys. Condens. Matter **30**(48), 485803 (2018). https://doi.org/10.1088/1361-648X/aae9bc
31. S. Taniguchi, T. Nishikawa, Y. Yasui, Y. Kobayashi, M. Sato, T. Nishioka, M. Kontani, K. Sano, Spin gap behavior of S=1/2 quasi-two-dimensional system CaV\(_4\)O\(_9\). J. Phys. Soc. Jpn. **64**(8), 2758–2761 (1995). https://doi.org/10.1143/JPSJ.64.2758
32. N. Katoh, M. Imada, Spin gap in two-dimensional Heisenberg model for CaV\(_4\)O\(_9\). J. Phys. Soc. Jpn. **64**(11), 4105–4108 (1995). https://doi.org/10.1143/JPSJ.64.4105
33. K. Kodama, H. Harashina, H. Sasaki, Y. Kobayashi, M. Kasai, S. Taniguchi, Y. Yasui, M. Sato, K. Kakurai, T. Mori, M. Nishi, Study of spin-gap formation in quasi-two-dimensional S=1/2 system CaV\(_4\)O\(_9\): neutron scattering and NMR. J. Phys. Soc. Jpn. **66**(3), 793–802 (1997). https://doi.org/10.1143/JPSJ.66.793
34. M. Albrecht, F. Mila, D. Poilblanc, Presence of midgap states in CaV\(_4\)O\(_9\). Phys. Rev. B **54**, 15856–15859 (1996). https://doi.org/10.1103/PhysRevB.54.15856
35. M. Troyer, H. Kontani, K. Ueda, Phase diagram of depleted Heisenberg model for CaV\(_4\)O\(_9\). Phys. Rev. Lett. **76**, 3822–3825 (1996). https://doi.org/10.1103/PhysRevLett.76.3822
36. K. Ueda, H. Kontani, M. Sigrist, P.A. Lee, Plaquette resonating-valence-bond ground state of CaV\(_4\)O\(_9\). Phys. Rev. Lett. **76**, 1932–1935 (1996). https://doi.org/10.1103/PhysRevLett.76.1932
37. O.A. Starykh, M.E. Zhitomirsky, D.I. Khomskii, R.R.P. Singh, K. Ueda, Origin of spin gap in CaV\(_4\)O\(_9\): effects of frustration and lattice distortions. Phys. Rev. Lett. **77**, 2558–2561 (1996). https://doi.org/10.1103/PhysRevLett.77.2558
38. S. Sachdev, N. Read, Spin-Peierls states of quantum antiferromagnets on the CaV\(_4\)O\(_9\) lattice. Phys. Rev. Lett. **77**, 4800–4803 (1996). https://doi.org/10.1103/PhysRevLett.77.4800
39. Z. Weihong, M.P. Gelfand, R.R.P. Singh, J. Oitmaa, C.J. Hamer, Heisenberg models for CaV\(_4\)O\(_9\): expansions about high-temperature, plaquette, Ising, and dimer limits. Phys. Rev. B **55**, 11377–11390 (1997). https://doi.org/10.1103/PhysRevB.55.11377
40. A. Bao, H.-S. Tao, H.-D. Liu, X.Z. Zhang, W.-M. Liu, Quantum magnetic phase transition in square–octagon lattice. Sci. Rep. **4**(1), 1–7 (2014)
41. S.A. Owerre, Two-dimensional Dirac nodal loop magnons in collinear antiferromagnets. J. Phys. Condens. Matter **30**(28), 28LT01 (2018). https://doi.org/10.1088/1361-648X/aac8b5
42. M. Deb, A.K. Ghosh, Magnetic field induced topological nodal-lines in triplet excitations of frustrated antiferromagnet CaV\(_4\)O\(_9\). Eur. Phys. J. B **93**(8), 145 (2020). https://doi.org/10.1140/epjb/e2020-10236-9
43. A. Maity, Y. Iqbal, S. Mandal, Competing orders in a frustrated Heisenberg model on the Fisher lattice. Phys. Rev. B **102**, 224404 (2020). https://doi.org/10.1103/PhysRevB.102.224404
44. M. Kargarian, G.A. Fiete, Topological phases and phase transitions on the square–octagon lattice. Phys. Rev. B **82**, 085106 (2010). https://doi.org/10.1103/PhysRevB.82.085106
45. B. Pal, Nontrivial topological flat bands in a diamond–octagon lattice geometry. Phys. Rev. B **98**, 245116 (2018). https://doi.org/10.1103/PhysRevB.98.245116
46. X.-P. Liu, W.-C. Chen, Y.-F. Wang, C.-D. Gong, Topological quantum phase transitions on the kagomé and square–octagon lattices. J. Phys. Condens. Matter **25**(30), 305602 (2013)
47. A. Sil, A.K. Ghosh, Emergence of photo-induced multiple topological phases on square–octagon lattice. J. Phys. Condens. Matter **31**(24), 245601 (2019)
48. Y. Yang, X. Li, Topological phase transitions on the square–octagon lattice with next-nearest-neighbor hopping. Eur. Phys. J. B **92**(12), 1–5 (2019)
49. Y.-T. Kang, L. Chen, F. Yang, D.-X. Yao, Single-orbital realization of high-temperature \(s^{\pm}\) superconductivity in the square–octagon lattice. Phys. Rev. B **99**, 184506 (2019). https://doi.org/10.1103/PhysRevB.99.184506
50. L.H.C.M. Nunes, C.M. Smith, Flat-band superconductivity for tight-binding electrons on a square–octagon lattice. Phys. Rev. B **101**, 224514 (2020). https://doi.org/10.1103/PhysRevB.101.224514
51. T. Mertz, R. Valentí, Engineering topological phases guided by statistical and machine learning methods. Phys. Rev. Res. **3**, 013132 (2021). https://doi.org/10.1103/PhysRevResearch.3.013132
52. T. Mertz, *Understanding Topological Phases of Matter with Statistical Methods*. Doctoral thesis. Universitätsbibliothek Johann Christian Senckenberg (2022)
53. M.V. Berry, Quantal phase factors accompanying adiabatic changes. Proc. R. Soc. Lond. A Math. Phys. Sci. **392**(1802), 45–57 (1984). https://doi.org/10.1098/rspa.1984.0023
54. F. Wilczek, A. Zee, Appearance of gauge structure in simple dynamical systems. Phys. Rev. Lett. **52**, 2111–2114 (1984). https://doi.org/10.1103/PhysRevLett.52.2111
55. T. Fukui, Y. Hatsugai, H. Suzuki, Chern numbers in discretized Brillouin zone: efficient method of computing (spin) Hall conductances. J. Phys. Soc. Jpn. **74**(6), 1674–1677 (2005). https://doi.org/10.1143/JPSJ.74.1674
56. A. Bhattacharyya, On a measure of divergence between two statistical populations defined by their probability distributions. Bull. Calcutta Math. Soc. **35**, 99–109 (1943)
57. L. Pardo, *Statistical Inference Based on Divergence Measures*, 1st edn. (Chapman and Hall/CRC, New York, 2005). https://doi.org/10.1201/9781420034813

## Acknowledgements

We would like to thank Thomas Mertz for valuable input and discussions. We acknowledge support from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through TRR 288-422213477 (Project B05) and through QUAST FOR 5249-449872909 (Project P4). We thank the Open Access Publication Fund of Goethe Universität Frankfurt am Main for financially supporting the open access publication of this article.

## Funding

Open Access funding enabled and organized by Projekt DEAL.

## Ethics declarations

### Conflict of interest

The authors declare that they have no conflict of interest.

## Additional information

Focus Point on Machine Learning for Materials Physics: From Pitfalls to Best Practices.

Guest Editors: D. Di Sante, A. M. Sengupta.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Wunderlich, P., Ferrari, F. & Valentí, R. Detecting topological phases in the square–octagon lattice with statistical methods.
*Eur. Phys. J. Plus* **138**, 336 (2023). https://doi.org/10.1140/epjp/s13360-023-03937-y

DOI: https://doi.org/10.1140/epjp/s13360-023-03937-y