On zero-error codes produced by greedy algorithms

We present two greedy algorithms that determine zero-error codes and lower bounds on the zero-error capacity. These algorithms have several advantages, e.g., they do not store a whole product graph in computer memory and they use so-called distributions in all dimensions to get better approximations of the zero-error capacity. We also show an additional application of our algorithms.


Introduction
A discrete channel $W : \mathcal{X} \to \mathcal{Y}$ (or simply $W$) is defined as a stochastic matrix whose rows are indexed by the elements of a finite input set $\mathcal{X}$ while the columns are indexed by a finite output set $\mathcal{Y}$. The $(x, y)$th entry is the probability $W(y \mid x)$ that $y$ is received when $x$ is transmitted. A sequence of channels $\{W^n : \mathcal{X}^n \to \mathcal{Y}^n\}_{n=1}^{\infty}$, where $W^n : \mathcal{X}^n \to \mathcal{Y}^n$ is the $n$th direct power of $W$, i.e., $W^n(y_1 y_2 \ldots y_n \mid x_1 x_2 \ldots x_n) = \prod_{i=1}^{n} W(y_i \mid x_i)$, and $\mathcal{X}^n$ is the $n$th Cartesian power of $\mathcal{X}$, is called a discrete memoryless channel (DMC) with stochastic matrix $W$ and is denoted by $\{W : \mathcal{X} \to \mathcal{Y}\}$ or simply $\{W\}$. See Shannon (1956), Csiszár and Körner (2011), Körner and Orlitsky (1998), Cover and Thomas (2006) and McEliece (2004) for more details.
Let $W : \mathcal{X} \to \mathcal{Y}$ be a discrete channel. We define the ω-characteristic graph $G$ of $W$ as follows. Its vertex set is $V(G) = \mathcal{X}$ and its set of edges $E(G)$ consists of input pairs that cannot result in the same output, namely, pairs of orthogonal rows of the matrix $W$. We define the α-characteristic graph $G(W)$ (characteristic graph for short) of $W$ as the complement of the ω-characteristic graph of $W$. Let $\{W : \mathcal{X} \to \mathcal{Y}\}$ be a DMC, so that $W : \mathcal{X} \to \mathcal{Y}$ is the corresponding discrete channel. We define the characteristic graph $G(\{W\})$ of the discrete memoryless channel $\{W\}$ as $\{G(W^n)\}_{n=1}^{\infty}$. The Shannon (zero-error) capacity $C_0(W)$ of the DMC $\{W : \mathcal{X} \to \mathcal{Y}\}$ is defined as $C(G(W))$; see Csiszár and Körner (2011), Körner and Orlitsky (1998), Cover and Thomas (2006) and McEliece (2004) for more details. Let $G$ be the characteristic graph of $W$ and $\Theta(G) = \sup_{n \in \mathbb{N}} \sqrt[n]{\alpha(G^n)}$. Then $\Theta(G)$ uniquely determines $C_0(W)$.
Let $W : \mathcal{X} \to \mathcal{Y}$ be a discrete channel with the characteristic graph $G$. A sequence of input letters is called an input word. Input words $x_1 x_2 \ldots x_n \in \mathcal{X}^n$ and $x'_1 x'_2 \ldots x'_n \in \mathcal{X}^n$ are orthogonal if the vectors $W^n(\cdot \mid x_1 x_2 \ldots x_n)$ and $W^n(\cdot \mid x'_1 x'_2 \ldots x'_n)$ are orthogonal. A zero-error code of block length $n$ for a DMC is defined by a set of mutually orthogonal input words (Körner and Orlitsky 1998; Cover and Thomas 2006). Furthermore, an independent set $I$ of the characteristic graph $G(W^n)$ corresponds to a zero-error code for $W^n$, and $G(W^n)$ is the same as the $n$th strong power $G^n$ (Shannon 1956; Körner and Orlitsky 1998).
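The construction of the characteristic graph from a stochastic matrix can be sketched in a few lines. The function name `characteristic_graph` and the example channel (a five-letter "pentagon" channel in which each input is received as itself or its successor) are illustrative assumptions, not taken from the text.

```python
from itertools import combinations

def characteristic_graph(W):
    """Edge set of the (alpha-)characteristic graph of a channel matrix W.

    W is a list of rows, one per input letter. Two inputs are adjacent
    when their rows are NOT orthogonal, i.e. some output has positive
    probability under both inputs.
    """
    edges = set()
    for x, xp in combinations(range(len(W)), 2):
        if any(a > 0 and b > 0 for a, b in zip(W[x], W[xp])):
            edges.add((x, xp))
    return edges

# Hypothetical example: input x is received as x or x+1 (mod 5),
# each with probability 1/2.
W = [[0.5 if y in (x, (x + 1) % 5) else 0.0 for y in range(5)]
     for x in range(5)]
print(sorted(characteristic_graph(W)))
```

For this example the resulting graph is the 5-cycle, so a zero-error code of block length 1 is an independent set of that cycle.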
The research on zero-error codes was initiated by Shannon in 1956. He found capacities of a class of channels (graphs) that does not yield additional information benefits (Shannon 1956) and he provided a method which enables constructing codes for these channels. The research was continued by, among others, Lovász (1979) in his IEEE Information Theory Society award work, in which he determined the values of Shannon capacities for some channels with effective codes using the so-called Lovász function. The class of channels examined by Lovász is represented by the so-called vertex-transitive, self-complementary graphs. It is the only class containing channels with effective codes for which an explicit formula for the Shannon capacity is known.
Recently, Polak and Schrijver (2019) and Mathew and Östergård (2017) made some progress in research on channels represented by strong powers of cycles. Moreover, Boche and Deppe (2020) proved that the zero-error capacity is uncomputable in the Banach-Mazur and Borel-Turing senses. Earlier, Alon and Lubetzky (2006) showed that the sequence of independence numbers of strong powers of a fixed graph can exhibit a complex and unpredictable structure. In this article, we propose polynomial-time algorithms that approximate the capacity for some channels.
In the next section, we describe the so-called fractional independence number defined by Rosenfeld (1967), which is strongly related to the considered problem.

Fractional independence number
Computing the independence number of a graph G = (V , E) can be formulated by the following integer program.
where $S = \{0, 1\}$. Now let $S = [0, 1]$. Given a graph $G$, by $\alpha^*_2(G)$ we denote the optimum of the objective function in the program (1) with this relaxed $S$. More generally, for a graph $G$ and a set $\mathcal{C}$ of (not necessarily all) its cliques, by $\alpha^*_{\mathcal{C}}(G)$ we denote the optimum of the objective function in the following linear program.
If $\mathcal{C}$ is the set of all maximal cliques of size at most $r$ in $G$, then we denote $\alpha^*_{\mathcal{C}}(G)$ by $\alpha^*_r(G)$. If $\mathcal{C}$ contains the set of all cliques (or equivalently all maximal cliques) of $G$, then we denote $\alpha^*_{\mathcal{C}}(G)$ by $\alpha^*(G)$ and call it the fractional independence number of $G$. It is worth noting that $\alpha^*$ is multiplicative with respect to the strong product (Scheinerman and Ullman 2011).
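The programs (1) and (2) referred to above can be written in a standard form. The following is a sketch consistent with the surrounding definitions; the edge constraints for (1) and the clique constraints for (2) are assumptions based on the usual formulation, not a quotation of the original displays.

```latex
% (1): edge formulation, with S = {0,1} (integer) or S = [0,1] (relaxed)
\max \sum_{v \in V} x_v
\quad \text{subject to} \quad
x_u + x_v \le 1 \ \ \forall\, uv \in E, \qquad x_v \in S \ \ \forall\, v \in V.

% (2): clique formulation for a set of cliques \mathcal{C}
\max \sum_{v \in V} x_v
\quad \text{subject to} \quad
\sum_{v \in C} x_v \le 1 \ \ \forall\, C \in \mathcal{C}, \qquad x_v \in [0, 1] \ \ \forall\, v \in V.
```

For example, for the 5-cycle $C_5$ with $\mathcal{C} = E(C_5)$, the assignment $x_v = 1/2$ is feasible with value $5/2$, and summing the five edge constraints gives $2 \sum_v x_v \le 5$, so the relaxed optimum is $5/2$, whereas $\alpha(C_5) = 2$.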
The following results present some properties of the linear program (2). In particular, the first observation establishes an order between the above-mentioned measures.

Lemma 1 For every graph G and a non-empty set of its cliques C we have
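where $\varsigma(\mathcal{C}) = \min\{\sum_{C \in \mathcal{C}} |\{v\} \cap C| : v \in \bigcup_{C \in \mathcal{C}} C\}$ and $R_{\mathcal{C}}(G) = |V(G)| - |\bigcup_{C \in \mathcal{C}} C|$. Furthermore, the equalities hold in the inequality chain (3) if $G$ is vertex-transitive and $\mathcal{C}$ is the set of all largest cliques in $G$.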
Proof It is well known (Gross et al. 2014) that for every graph $G$ we have [...]. From (4) and Observation 1, the left inequality holds in (3). Given the linear program (2) and its optimum $\alpha^*$ [...]. If $G$ is vertex-transitive and $\mathcal{C}$ is the set of all largest cliques in $G$, then $\mathcal{C}$ covers the whole vertex set, i.e., $V(G) = \bigcup_{C \in \mathcal{C}} C$. Hence $R_{\mathcal{C}}(G) = 0$. Furthermore, every vertex is contained in the same number of largest cliques. Hence $\varsigma(\mathcal{C})\,|V(G)| = \omega(G)\,|\mathcal{C}|$.
It is interesting that the measure α * has a particular interpretation in information theory (Shannon 1956;Körner and Orlitsky 1998).

Capacity approximation
It is well known (Shannon 1956) that [...]. Hales (1973) showed that for arbitrary graphs $G$ and $H$ we have [...]. In contrast to the above results, in the next section we use the fractional independence number to calculate lower bounds on the Shannon capacity and the independence number of strong products.
A function $\beta : \mathcal{G} \to \mathbb{R}$ is supermultiplicative (resp. submultiplicative) on $\mathcal{G}$ with respect to the operation $\bullet$ if for any two graphs $G_1, G_2 \in \mathcal{G}$ we have $\beta(G_1 \bullet G_2) \ge \beta(G_1) \cdot \beta(G_2)$ (resp. $\beta(G_1 \bullet G_2) \le \beta(G_1) \cdot \beta(G_2)$). A function that is both supermultiplicative and submultiplicative is called multiplicative. The independence number $\alpha$ is supermultiplicative on the set of all graphs with respect to the strong product, i.e., $\alpha(G \boxtimes H) \ge \alpha(G) \cdot \alpha(H)$ for any graphs $G$ and $H$.
Let $B$ be a lower bound on the independence number $\alpha$, i.e., $\alpha(G) \ge B(G)$. If $B(G^i) > (\alpha(G))^i$ for some $i \ge 2$, then $G$ is of type II and is more interesting from an information theory point of view (Shannon 1956). Since $\alpha(G) \ge B(G)$, this is possible only if $B(G^i) > (B(G))^i$. Thus we require that $B$ has the last two properties for at least one graph, i.e., that $B$ recognizes some graphs of type II.
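A concrete graph of type II is the 5-cycle. The sketch below checks that the classical five-word code $\{(i, 2i \bmod 5)\}$ is an independent set of the strong square of $C_5$, so that $\alpha(C_5 \boxtimes C_5) \ge 5 > 4 = (\alpha(C_5))^2$. The helper names are illustrative assumptions.

```python
from itertools import combinations

def adjacent_c5(a, b):
    # a and b are adjacent in the 5-cycle when they differ by 1 modulo 5
    return (a - b) % 5 in (1, 4)

def adjacent_c5_squared(u, v):
    # strong product: distinct vertices whose coordinates are
    # pairwise equal or adjacent
    return u != v and all(a == b or adjacent_c5(a, b) for a, b in zip(u, v))

code = [(i, (2 * i) % 5) for i in range(5)]
independent = all(not adjacent_c5_squared(u, v)
                  for u, v in combinations(code, 2))
print(independent, len(code))  # prints "True 5"
```

Any lower bound $B$ that returns at least 5 on this product would therefore recognize $C_5$ as a graph of type II.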
The residue $R$ of a graph $G$ with degree sequence $S : d_1 \ge d_2 \ge d_3 \ge \cdots \ge d_n$ is the number of zeros obtained by the iterative process consisting of deleting the first term $d_1$ of $S$, subtracting 1 from the $d_1$ following ones, and re-sorting the new sequence in non-increasing order (Favaron et al. 1993). It is well known (Favaron et al. 1991) that $\alpha(G) \ge R(G)$. Unfortunately, the following negative result holds.
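The iterative process that defines the residue is short enough to write out directly. This is a minimal sketch that assumes the input is the degree sequence of an actual graph (a graphic sequence); the function name `residue` is an illustrative choice.

```python
def residue(degrees):
    """Residue of a graphic degree sequence.

    Repeatedly delete the largest term d_1, subtract 1 from the d_1
    following terms, and re-sort, until only zeros remain.
    """
    seq = sorted(degrees, reverse=True)
    while seq and seq[0] > 0:
        d = seq.pop(0)            # delete the first (largest) term d_1
        for i in range(d):        # subtract 1 from the d_1 following terms
            seq[i] -= 1
        seq.sort(reverse=True)    # re-sort in non-increasing order
    return len(seq)               # number of zeros left

print(residue([2, 2, 2, 2, 2]))   # 5-cycle: prints 2
```

For the 5-cycle the residue equals the independence number, which is 2.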

Proposition 1 Let $G$ and $H$ be regular or split graphs. Then $R(G \boxtimes H) \le R(G) \cdot R(H)$.
Proof Let $G$ and $H$ be regular graphs. For a regular graph $G$, from Favaron et al. (1991) we have $R(G) = \lceil |V(G)| / (1 + d(G)) \rceil$, where $d(G)$ is the degree of each vertex of $G$. From Jurkiewicz (2017) we know that the ceiling function is submultiplicative on non-negative real numbers with respect to multiplication. Hence $R(G \boxtimes H) = \lceil |V(G)||V(H)| / (1 + (d(G)d(H) + d(G) + d(H))) \rceil \le \lceil |V(G)| / (1 + d(G)) \rceil \cdot \lceil |V(H)| / (1 + d(H)) \rceil = R(G) \cdot R(H)$, since a strong product of regular graphs is regular.
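A quick numerical check of the regular case, using the 5-cycle as an illustrative choice (not an example from the text) and the formula $R(G) = \lceil |V(G)| / (1 + d(G)) \rceil$ for regular graphs:

```latex
R(C_5) = \left\lceil \tfrac{5}{1 + 2} \right\rceil = 2, \qquad
R(C_5 \boxtimes C_5) = \left\lceil \tfrac{25}{1 + 8} \right\rceil = 3
\;\le\; 4 = R(C_5)^2 .
```

Since $\alpha(C_5 \boxtimes C_5) = 5$, the residue bound of 3 on the product does not detect that $C_5$ is of type II.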
Let $G$ and $H$ be split graphs. From Barrus (2012) and Hammack et al. (2011) [...].
We conjecture that the residue is submultiplicative on the set of all graphs with respect to the strong product. This probably means that the residue does not recognize any graphs of type II. There are more such bounds, e.g., the average distance (Jurkiewicz 2017), the Caro-Wei bound and the Wilf bound (Jurkiewicz and Pikies 2015). On the other hand, it is hard to find bounds that recognize at least one graph of type II.

Greedy algorithm MIN
In this section, we analyze, in the context of DMC codes, the so-called greedy algorithm Min (Algorithm 5.1) that determines an independent set and a lower bound on the independence number of a graph (Harant and Schiermeyer 2001). The algorithm Min has complexity $O(n^2)$. Similar greedy algorithms can be found in the literature (Borowiecki and Rautenbach 2015).

Algorithm 5.1 Greedy Algorithm Min
return I

A greedy algorithm always makes the choice that looks best at the moment. That is, it makes a locally optimal choice in the hope that this choice will lead to a globally optimal solution (Cormen et al. 2009). Vertices chosen in this way by Min often strongly block the choice of vertices at later stages of the algorithm, so the generated independent sets are small, especially for strong products of graphs of type II. In Table 1, we summarize results produced by Min for these graphs. On the other hand, Min works well for strong products of the investigated graphs of type I. Although channels represented by graphs of type I do not yield additional information benefits, we also need a fast method that determines zero-error codes for these channels. There are at least two ways to do this: we can run Min on the characteristic graph $G$ of a channel (since $I^n$ is an independent set of $G^n$ if $I$ is an independent set of $G$) or directly on a strong power of $G$. In Table 2, we summarize our results produced by Min for some graphs of type I. It is important to note that, for all results, we chose randomly among the vertices with the smallest degree in Min (line 4).

Table 1 For each graph $G \in \mathcal{G}^+_{n,2} = \{H : \alpha(H^2) > \alpha^2(H) \wedge |V(H)| = n\}$ we determined $T = 10^{(11-n)}$ independent sets of the graph $G^2$ using the algorithm Min and from these $T$ sets, we chose the largest one, which is denoted by $I$

The independent set $I$ is the larger of $I' \times I'$ and $I''$, where $I' = \mathrm{Min}(G)$ and $I'' = \mathrm{Min}(G^2)$. In addition, we obtained $|I| = \alpha(G^2)$ for all graphs on $n = 1, 2, 3, 4$
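The listing of Algorithm 5.1 is not reproduced above, so the following is only a sketch of the usual minimum-degree greedy rule it is based on: pick a vertex of smallest degree (ties broken at random), add it to the solution, delete its closed neighborhood, and repeat. The adjacency-dictionary representation and the name `greedy_min` are assumptions made for illustration.

```python
import random

def greedy_min(adj, rng=random):
    """Greedy independent set by repeated minimum-degree selection.

    adj maps each vertex to the set of its neighbors (undirected, no loops).
    """
    remaining = {v: set(nb) for v, nb in adj.items()}   # working copy
    independent = []
    while remaining:
        min_deg = min(len(nb) for nb in remaining.values())
        candidates = [v for v, nb in remaining.items() if len(nb) == min_deg]
        v = rng.choice(candidates)          # random tie-breaking
        independent.append(v)
        removed = remaining[v] | {v}        # closed neighborhood N[v]
        for u in removed:
            del remaining[u]
        for nb in remaining.values():
            nb.difference_update(removed)   # update degrees of the rest
    return independent

c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(len(greedy_min(c5)))                  # prints 2
```

On the 5-cycle every run returns two vertices, which matches $\alpha(C_5) = 2$; on strong products of type II graphs the same rule tends to fall short, as discussed above.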

Modification of greedy algorithm MIN
In this section, we present a new greedy algorithm that produces an independent set (a DMC code) and a lower bound on the independence number of a strong product. This value, from (5), also determines a lower bound on the Shannon capacity. We try to improve Min, since from our research it follows that it does not work well, i.e., it recognizes a small number of graphs of type II. Our goal is to get larger independent sets for strong products of graphs of type II by a modification of the mentioned algorithm. We begin by introducing definitions required in the rest of the paper.
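A semigroup is a set $S$ with an associative binary operation on $S$. A semiring is defined as an algebra $(S, +, \cdot)$ such that $(S, +)$ and $(S, \cdot)$ are semigroups and for any $a, b, c \in S$ we have $a \cdot (b + c) = a \cdot b + a \cdot c$ and $(b + c) \cdot a = b \cdot a + c \cdot a$ (Adhikari and Adhikari 2014). Note that $(\mathcal{G}, \cup, \boxtimes)$ is a semiring, where $\mathcal{G}$ is the set of all finite graphs. In addition, $\cup$ and $\boxtimes$ are commutative operations with neutral elements $(\emptyset, \emptyset)$ and $K_1$, respectively.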
Lemma 2 Let $p, r$ be positive integers and $G_1, G_2, \ldots, G_r$ be graphs. Then [...], where the summations extend over all ordered sequences $(p_1, p_2, \ldots, p_r)$ of nonnegative integers that sum to $p$.
Proof The first part of the lemma can be proved in a way analogous to that of (Loehr 2011, Theorem 2.12) for rings (we only need the above-mentioned properties of the semiring $(\mathcal{G}, \cup, \boxtimes)$).
The second part of the lemma follows from the fact that the independence number is additive with respect to the disjoint union $\cup$ for all graphs.
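To illustrate the kind of expansion that Lemma 2 describes, consider the smallest non-trivial case $r = p = 2$. The identity below follows directly from the distributivity of $\boxtimes$ over $\cup$ and the commutativity of both operations; it is a sketch of one instance, not a restatement of the lemma's displays.

```latex
(G_1 \cup G_2)^{\boxtimes 2}
  \;\cong\; G_1^{\boxtimes 2} \,\cup\, 2\,(G_1 \boxtimes G_2) \,\cup\, G_2^{\boxtimes 2},
\qquad
\alpha\bigl((G_1 \cup G_2)^{\boxtimes 2}\bigr)
  \;=\; \alpha\bigl(G_1^{\boxtimes 2}\bigr)
  + 2\,\alpha(G_1 \boxtimes G_2)
  + \alpha\bigl(G_2^{\boxtimes 2}\bigr).
```

Here $2\,(G_1 \boxtimes G_2)$ denotes two disjoint copies of $G_1 \boxtimes G_2$, and the second identity uses $\alpha(G \cup H) = \alpha(G) + \alpha(H)$.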
The considered modification of the greedy algorithm takes as input arbitrary graphs $G_1, G_2, \ldots, G_r$ and produces as output an independent set of $G^{\boxtimes} = G_1 \boxtimes G_2 \boxtimes \cdots \boxtimes G_r$. From Lemma 2, we can find connected components of $G^{\boxtimes}$. Hence, our greedy algorithm can be applied to each connected component of $G^{\boxtimes}$ separately, or to the entire graph $G^{\boxtimes}$ at once. We prefer the first method.
The next step of our modification is a reduction of the factors of the strong product. For each $i \in \{1, 2, \ldots, r\}$ and any [...] (Jurkiewicz 2020). Let $G$ be a factor of the strong product $G^{\boxtimes}$, for example $G = G_i$. Let $>$ be a strict total order on $V(G)$. We reduce the factor $G$ by Algorithm 6.1 (Reduction GR), which has complexity $O(\Delta^2 m)$. This algorithm is correct (in the considered context) since we remove vertices from the strong product, and hence we can only decrease or leave unchanged its independence number.
For some graphs, which we take as input, e.g., for a path on n ≥ 6 vertices, we need to recursively repeat (at most n times) the algorithm GR to get a smaller graph. Sometimes, the algorithm GR produces vertices with degree zero. Such vertices should be removed from a graph, but taken into account in the outcome.
Let $G$ be a graph and $k$ be a positive integer. Let $A$ be a $k$-tuple of subsets of $V(G)$. By $B_G(A)$ we denote a sequence containing upper bounds on $\alpha(G[A_i])$ for [...]. Thus, if $Q = \{Q_1, Q_2, \ldots, Q_k\}$ ($k \in \mathbb{N}^+$) is a set of cliques of $G_i$ and [...], where $\alpha_i$ occurs $k$ times. Let [...]. The function $\alpha^*$ is multiplicative with respect to the strong product for all graphs (Scheinerman and Ullman 2011). Thus, from Observation 1 and (5) we get [...] and finally $\alpha_i$ satisfies the condition (6). Furthermore, from (7) and (3), for graphs without vertices of degree zero, the following substitution also satisfies the condition (6). [...]

Algorithm 6.1 Reduction GR
1: function GR(G)
2: [...]

Let $i \in \{1, 2, \ldots, r\}$. Algorithm 6.3 (Distribution Distr), which takes as input a graph $G = G_i$ and an upper bound $\alpha_b = \alpha_i$, determines a distribution for the graph $G$. The algorithm Distr, whose running time is $O(n^2)$, uses Algorithm 6.2 (Greedy Coloring GC), which has complexity $O(n + m)$ (Kubale 2004). The algorithm GC takes as input a graph $G$ and an arbitrary permutation $P$ of the vertex set of $G$. GC in Distr legally colors the complement of $G$ and hence produces a partition $Q$ of the vertex set of $G$ into cliques (the so-called clique cover of $G$). Subsequently, Distr distributes $\alpha_b = \alpha_i$ potential elements of an independent set roughly evenly (about $\alpha_b / |Q|$ elements or less, depending on the sum from line 19) among all vertices of $Q$ (as well as among all subgraphs of [...]).

Algorithm 6.2 Greedy Coloring GC
1: function GC(G, P)
   comment: In all algorithms, loops containing the keyword in are performed in the given order.
2: for each v in P do
3: assign to v the smallest possible legal color C(v) in G
4: return C

As we mentioned before, vertices chosen by Min strongly block the choice of vertices at later stages of the algorithm. Our greedy algorithm, i.e., Algorithm 6.4 (Greedy Algorithm Min-SP), significantly diminishes this effect by using the generated distributions. The vertex set of $G_1 \boxtimes G_2 \boxtimes \cdots \boxtimes G_r$ can be interpreted as an $r$-dimensional cuboid of size $|V(G_1)| \cdot |V(G_2)| \cdot \ldots \cdot |V(G_r)|$. Min-SP uses distributions in all $r$ dimensions. Earlier (Baumert et al. 1971; Codenotti et al. 2003), only one distribution was used at a time in algorithms for the maximum independent set problem in subclasses of the strong product of graphs, in order to reduce the search space. The important point to note here is that in the cases that are more interesting from an information theory point of view, i.e., if $G_1 = G_2 = \cdots = G_r$, some parts of Min-SP are much simpler, e.g., we can determine one distribution and then use it in all dimensions.
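The greedy coloring GC and the way Distr turns it into a clique cover can be sketched as follows. The adjacency-dictionary representation, the function names, and the use of the natural vertex order (Distr orders vertices by non-increasing degree) are assumptions made for illustration.

```python
def greedy_coloring(adj, order):
    """Give each vertex, in the given order, the smallest color
    not used by its already colored neighbors."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

def clique_cover(adj, order):
    """Partition the vertex set into cliques by greedily coloring
    the complement graph: each color class is a clique of the graph."""
    vertices = set(adj)
    complement = {v: vertices - adj[v] - {v} for v in adj}
    color = greedy_coloring(complement, order)
    classes = {}
    for v in order:
        classes.setdefault(color[v], []).append(v)
    return list(classes.values())

c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
cover = clique_cover(c5, order=range(5))
print(cover)  # prints [[0, 1], [2, 3], [4]]
```

For the 5-cycle this yields three cliques, so at most three potential elements of an independent set would be spread over this factor.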
Min-SP defines four sets $N$, $V$, $F$ and $I$, where $N$ is the closed neighborhood of a chosen vertex $v^*$ (line 10), $V$ is the set of vertices that are available for the next iterations, $F$ is the set of forbidden vertices that are not available for the next iterations, and $I$ is the current solution (the current independent set). In lines 13-14 and lines 18-21, Min-SP updates distributions and degrees of all vertices from $V$, respectively. In line 17, elements of $N$ and $F$ are removed from $V$, but only degrees of vertices from $N$ are updated.

Algorithm 6.3 Distribution Distr
[...] $\mathrm{distr}_v \leftarrow 0$
4: assign to P vertices of G arranged in non-increasing order according to their degrees
5: C ← GC($\overline{G}$, P)
6: create the clique cover CC of G from the coloring C
7: sort cliques from CC in non-increasing order according to their sizes
8: sort vertices in cliques from CC in non-decreasing order according to their degrees
9: for each Q in CC do
10: for each v in Q do
[...]
14: K ← q
15: if i < r then
16: [...]

An advantage of Min-SP is that we do not need to store the edges of $G^{\boxtimes} = G_1 \boxtimes G_2 \boxtimes \cdots \boxtimes G_r$ in computer memory. This is important since $|E(G^{\boxtimes})|$ almost always grows quickly with $r$. In memory, we only keep the factors of $G^{\boxtimes}$, and the adjacency relation $\sim$ is checked directly from the conditions specified in the definition of the strong product (line 20). Sometimes [...] for $I \subseteq V(G)$ and a graph $G$. Thus, finally, it is possible to get a larger independent set of $G^{\boxtimes}$, i.e., $I' = I \cup \mathrm{Min}(G^{\boxtimes} - N_{G^{\boxtimes}}[I])$. We prefer this method in our computations. It turns out that we also do not need to store the edges of $G^{\boxtimes}$ if we want to execute $\mathrm{Min}(G^{\boxtimes} - N_{G^{\boxtimes}}[I])$. This can be done by a modification of Min similar to the one we performed when we constructed Min-SP.
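Checking adjacency from the factors alone, without materializing the product graph, can be sketched as below. Vertices of the product are tuples, each factor is an adjacency dictionary, and the function names are illustrative assumptions. The degree formula uses the fact that a closed neighborhood in the strong product is the Cartesian product of the closed neighborhoods in the factors.

```python
from math import prod

def adjacent(u, v, factors):
    """True if distinct tuples u, v are adjacent in the strong product.

    Each factor maps a vertex to the set of its neighbors; u and v are
    adjacent when every coordinate pair is equal or adjacent.
    """
    return u != v and all(a == b or b in g[a]
                          for a, b, g in zip(u, v, factors))

def degree(u, factors):
    """Degree of u in the strong product, from factor degrees only."""
    return prod(1 + len(g[a]) for a, g in zip(u, factors)) - 1

c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(adjacent((0, 0), (1, 4), [c5, c5]), degree((0, 0), [c5, c5]))
# prints "True 8"
```

Only the factors are kept in memory, so the storage cost is the sum of the factor sizes rather than the number of edges of the product.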
In Table 3, we summarize results produced by Min-SP. The algorithm has a running time of $O(|V|^3)$.
We can approximate the Shannon capacity using (5) and the algorithm Min-SP. We show it by the following example.
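As a worked illustration (a sketch using the classical 5-cycle, which is an assumption and not necessarily the example intended here): by the definition of $\Theta$, any independent set $I$ of $G^n$ found by a greedy algorithm yields a lower bound on $\Theta(G)$.

```latex
\Theta(G) \;=\; \sup_{n \in \mathbb{N}} \sqrt[n]{\alpha(G^n)} \;\ge\; \sqrt[n]{|I|}
\quad \text{for every independent set } I \text{ of } G^n .

% Example: G = C_5, n = 2, I = \{(i,\, 2i \bmod 5) : i = 0, \dots, 4\}, |I| = 5
\Theta(C_5) \;\ge\; \sqrt{5} \;\approx\; 2.236 \;>\; 2 = \alpha(C_5).
```

For $C_5$ this bound is known to be attained (Lovász 1979), so an algorithm that finds an independent set of size 5 in $C_5 \boxtimes C_5$ already determines the capacity of this channel.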

Another modification with additional conditions and relaxed distributions
In this section, we propose another modification of greedy algorithm Min, which is similar to Min-SP, but in addition, it uses distances between vertices of the strong product. We write down the new polynomial algorithm Min-SP2 (Algorithm 7.1) in a more compact form to make additional elements more visible. The modification, besides the advantages of Min-SP, has better accuracy (Table 4), but sometimes Min-SP gives a larger independent set than Min-SP2.
Algorithm 7.1 Greedy Algorithm Min-SP2
[...]
9: assign to $V^*$ the set of elements $v \in V$ with the min. $d(v)$
10: assign to $V^*$ the set of elements $v \in V^*$ with the max. $\sum_{i=1}^{r} \mathrm{distr}^{(i)}_{v_i}$
[...]
13: for each $v' \in I$ do
14: if $v$ and $v'$ differ on exactly one coordinate $i$ then
15: if [...]

In contrast to Min-SP, Min-SP2 determines the set $V^*$ of all vertices of $V$ with the smallest degree (line 9). The set $V$ is defined exactly as in Min-SP. In each iteration, Min-SP2 takes the vertices from $V^*$ with the largest $\sum_{i=1}^{r} \mathrm{distr}^{(i)}_{v_i}$ (i.e., it chooses vertices from "the least crowded region", line 10). We also realized that [...]. It is worth noting here that we always randomly relabel all input graphs at the beginning of our algorithm.

Table 4 For each graph $G \in \mathcal{G}^+_{n,2} = \{H : \alpha(H^2) > \alpha^2(H) \wedge |V(H)| = n\}$ we determined an independent set $I$ of the graph $G^2$ using the algorithm Min-SP2, which has better accuracy than Min-SP. By accuracy of these algorithms for examined graphs, we mean the sum of all numbers in the last column of an appropriate [...]
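The two-stage selection described for lines 9 and 10 can be sketched as follows. The data layout is hypothetical: `deg` maps each available product vertex (a tuple) to its current degree, and `distr[i][a]` holds the remaining budget of coordinate value `a` in dimension `i`. Only the selection rule is shown, not the rest of Min-SP2.

```python
def select_candidates(available, deg, distr):
    """Return the minimum-degree vertices that lie in the least crowded
    region, i.e. that maximize the sum of their per-dimension budgets."""
    min_deg = min(deg[v] for v in available)
    low = [v for v in available if deg[v] == min_deg]      # line 9

    def room(v):
        return sum(distr[i][a] for i, a in enumerate(v))

    best = max(room(v) for v in low)
    return [v for v in low if room(v) == best]             # line 10

# Hypothetical toy data for a two-dimensional product
available = [(0, 0), (0, 1), (1, 1)]
deg = {(0, 0): 3, (0, 1): 3, (1, 1): 5}
distr = [{0: 1, 1: 2}, {0: 0, 1: 2}]
print(select_candidates(available, deg, distr))  # prints [(0, 1)]
```

In this toy instance two vertices share the minimum degree, and the tie is resolved in favor of the one whose coordinates still have the most budget left.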
According to the above considerations, we propose a relaxed version of the function Distr, which we call Relaxed Distribution RDistr (Algorithm 7.2).

Algorithm 7.2 Relaxed Distribution RDistr
Community detection problems
Chalupa and Pospíchal (2014) investigated the growth of large independent sets in the Barabási-Albert model of scale-free complex networks. They formulated recurrence relations describing the cardinality of typical large independent sets and showed that this cardinality seems to scale linearly with network size. Independent sets in social networks represent groups of people who do not know anybody else within the group.
Hence, an independent set of a network plays a crucial role in community detection problems, since vertices of this set are naturally unlikely to belong to the same community (Chalupa and Pospíchal 2014; Whang et al. 2013). These facts imply that the number of communities in scale-free networks seems to be bounded from below by a linear function of network size (Chalupa and Pospíchal 2014). Leskovec et al. (2010) introduced the Kronecker graph network model that naturally obeys common real network properties. In particular, the model assumes that graphs have loops and corresponds to the strong product (Hammack et al. 2011). Let $i \ge 1$ and $G^{\boxtimes} = G^i$. As mentioned earlier, the function $\alpha$ is supermultiplicative and $\alpha^*$ is multiplicative with respect to the strong product for all graphs. Thus $(\alpha^*(G))^i = \alpha^*(G^{\boxtimes}) \ge \alpha(G^{\boxtimes}) \ge (\alpha(G))^i$ and hence $|V(G^{\boxtimes})|^{c^*} \ge \alpha(G^{\boxtimes}) \ge |V(G^{\boxtimes})|^{c}$, where $c^* = \log(\alpha^*(G)) / \log|V(G)|$ and $c = \log(\alpha(G)) / \log|V(G)|$. We have just shown that the cardinality of maximum independent sets in the mentioned model scales sublinearly with network size. Furthermore, if $G$ is of type I, then $\alpha(G^{\boxtimes}) = |V(G^{\boxtimes})|^{c}$. These considerations show that the number of communities in scale-free networks seems to be bounded from below by a sublinear (rather than a linear) function of network size. It is worth pointing out that we can approximate (resp. predict) the number of communities in the mentioned model (resp. in a real complex network) using Algorithms 6.4 or 7.1 (Greedy Algorithm Min-SP, Min-SP2).
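The two exponents can be computed directly. The sketch below uses the 5-cycle as an illustrative seed graph (an assumption, with $|V| = 5$, $\alpha = 2$ and $\alpha^* = 5/2$); the function name is hypothetical.

```python
from math import log

def scaling_exponents(n_vertices, alpha, alpha_frac):
    """Exponents c and c* such that, for every strong power with N vertices,
    N**c <= (independence number) <= N**c_star."""
    c = log(alpha) / log(n_vertices)
    c_star = log(alpha_frac) / log(n_vertices)
    return c, c_star

c, c_star = scaling_exponents(5, 2, 2.5)
print(round(c, 4), round(c_star, 4))  # prints "0.4307 0.5693"
```

Both exponents are strictly below 1, which is the sublinear scaling described above: for the $i$th strong power, with $5^i$ vertices, the independence number lies between $2^i$ and $2.5^i$.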