Abstract
It is known that the following five counting problems lead to the same integer sequence \({f_t}({n})\):
- (1) the number of nonequivalent compact Huffman codes of length n over an alphabet of t letters,
- (2) the number of “nonequivalent” complete rooted t-ary trees (level-greedy trees) with n leaves,
- (3) the number of “proper” words (in the sense of Even and Lempel),
- (4) the number of bounded degree sequences (in the sense of Komlós, Moser, and Nemetz), and
- (5) the number of ways of writing
$$\begin{aligned} 1= \frac{1}{t^{x_1}}+ \dots + \frac{1}{t^{x_n}} \end{aligned}$$with integers \(0 \le x_1 \le x_2 \le \dots \le x_n\).
In this work, we show that one can compute this sequence for all \(n<N\) with essentially one power series division. In total we need at most \(N^{1+\varepsilon }\) additions and multiplications of integers of cN bits (for a positive constant \(c<1\) depending on t only) or \(N^{2+\varepsilon }\) bit operations, respectively, for any \(\varepsilon >0\). This improves an earlier bound by Even and Lempel who needed \({O}({{N^3}})\) operations in the integer ring or \(O({N^4})\) bit operations, respectively.
1 Introduction
1.1 Motivation
The purpose of this paper is to study the complexity of a counting problem, namely determining the number of nonequivalent compact Huffman codes of length n over an alphabet of t letters, and several equivalent combinatorial or number theoretic objects; see below and in particular (1.1) for a precise definition. The fastest algorithm in the published literature is due to Even and Lempel [1] (1972) and has a complexity of \(O({N^3})\) operations in the ring of integers.
When actually computing the number of such compact Huffman codes, we experimentally observed that an approach of evaluating a generating function—this generating function was first studied by Flajolet and Prodinger [2] (1987)—appears to be very fast. A detailed analysis (see Theorem 1) shows that the complexity is indeed only \(O({N^{1+\varepsilon }})\) (for any \(\varepsilon >0\)) additions and multiplications of integers of size cN bits (see (3.1)), where \(c<1\) is a positive constant depending on t only.
In this paper, we will first describe the different but equivalent objects that we count and then present a quite detailed analysis of computing the number of these objects.
1.2 Codes, unit fractions and more
For a fixed integer \(t\ge 2\), Elsholtz, Heuberger and Prodinger [3] studied the number
$$\begin{aligned} {f_t}({r}) = \#\Bigl \{(x_1,\dots ,x_r)\in \mathbb {Z}^r :0 \le x_1 \le \dots \le x_r,\ 1= \frac{1}{t^{x_1}}+ \dots + \frac{1}{t^{x_r}}\Bigr \} \end{aligned}$$ (1.1)
for \(r\ge 0\), i.e., the number of partitions of 1 into r nonpositive powers of t. It is known that this counting problem is equivalent to several other counting problems, namely the number of “nonequivalent” complete rooted t-ary trees (also called “level-greedy trees”; see [2,3,4]), the number of “proper words” (in the sense of Even and Lempel [1]), the number of bounded degree sequences (in the sense of Komlós, Moser, and Nemetz [5]), and the number of nonequivalent compact Huffman codes (see Footnote 1) of length r over an alphabet of t letters. For a detailed survey on the existing results, applications and literature on these sequences, see [3]. As a small concrete example, we note that for \(t=2\), \(r=5\) we have \(f_2(5)=3\), as can be seen from working out the following:
$$\begin{aligned} 1 = \tfrac{1}{2}+\tfrac{1}{4}+\tfrac{1}{8}+\tfrac{1}{16}+\tfrac{1}{16} = \tfrac{1}{2}+\tfrac{1}{8}+\tfrac{1}{8}+\tfrac{1}{8}+\tfrac{1}{8} = \tfrac{1}{4}+\tfrac{1}{4}+\tfrac{1}{4}+\tfrac{1}{8}+\tfrac{1}{8}. \end{aligned}$$
As discussed in [3], \({f_t}({r})\) is positive only when \(r=1+n(t-1)\), so it is more convenient to study \({g_t}({n})=f_t(1+n(t-1))\) instead. For \(t=2\) the values of \({g_t}({n})\) start with
$$\begin{aligned} 1,\ 1,\ 1,\ 2,\ 3,\ 5,\ 9,\ 16,\ 28,\ 50,\ 89,\ \dots , \end{aligned}$$
for \(t=3\) with
$$\begin{aligned} 1,\ 1,\ 1,\ 2,\ 4,\ 7,\ 13,\ \dots , \end{aligned}$$
and the first terms of \({g_4}({n})\) are
$$\begin{aligned} 1,\ 1,\ 1,\ 2,\ 4,\ 8,\ 15,\ \dots . \end{aligned}$$
These are sequences A002572, A176485 and A176503 in the On-Line Encyclopedia of Integer Sequences [6].
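These counts can be cross-checked by brute force. The following Python sketch (the helper names `huffman_count` and `g_seq` are ours, not from the paper) counts \(f_t(r)\) by walking level by level through the corresponding complete t-ary tree; it is only a slow sanity check, not the fast algorithm of this article:

```python
from functools import lru_cache

def huffman_count(t, r):
    """f_t(r): number of ways to write 1 = 1/t^{x_1} + ... + 1/t^{x_r}
    with integers 0 <= x_1 <= ... <= x_r."""
    if r == 1:
        return 1  # the single solution 1 = 1/t^0
    @lru_cache(maxsize=None)
    def walk(v, leaves):
        # v internal nodes on the current level of a complete t-ary tree,
        # `leaves` leaves still to be placed on deeper levels
        if v == 0:
            return 1 if leaves == 0 else 0
        if leaves < t * v:  # each internal node needs at least t leaves below it
            return 0
        # l of the t*v children on the next level become leaves
        return sum(walk(t * v - l, leaves - l)
                   for l in range(min(t * v, leaves) + 1))
    return walk(1, r)

def g_seq(t, N):
    """g_t(n) = f_t(1 + n(t-1)) for n = 0, ..., N-1."""
    return [huffman_count(t, 1 + n * (t - 1)) for n in range(N)]

print(g_seq(2, 11))  # [1, 1, 1, 2, 3, 5, 9, 16, 28, 50, 89]
```

Each distinct level profile of leaves corresponds to exactly one nonequivalent code, which is why counting the walks counts \(f_t(r)\).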
1.3 Asymptotics
It has been proved (see Elsholtz, Heuberger, Prodinger [3]) that for fixed t, the asymptotic growth of these sequences can be described by two main terms and an error term as
$$\begin{aligned} {g_t}({n}) = c\, \rho ^n + c_2\, \rho _2^n + O({r_3^n}), \end{aligned}$$ (1.2)
where \(1< r_3< \rho _2< \rho < 2\). Here all constants depend on t. In particular, if \(t=2\), then \(\rho \approx 1.7941\).
Moreover, the authors of [3] also show that \(\rho = 2-2^{-t-1}+O({t\,4^{-t}})\) as \(t\rightarrow \infty\).
Beside the enumeration of all these objects, probabilistic questions concerning many different parameters have been studied asymptotically in [4, 7].
1.4 Algorithmic counting
As this family of sequences appears in many different contexts and as the sequences’ growth rates have been studied in detail (see the section above and the introduction of [3] for full details), it is somewhat surprising that the current record on the algorithmic complexity of determining the members of the sequence (in the case \(t=2\)) appears to be a 50-year-old paper by Even and Lempel [1]. Hence it seemed worthwhile to study this complexity from a new point of view, and we succeeded in improving the upper complexity bound considerably; see the section below.
The algorithm of Even and Lempel [1] produces the sequence \({g_t}({n})\) for \(n<N\). It takes \(O({N^3})\) additions of integers bounded by \(O({\rho ^N})\) (with \(\rho <2\); so integers of roughly N bits), which amounts to \(O({N^4})\) bit operations. They studied only the case \(t=2\) in detail, but mention that their result can be generalized to arbitrary t.
1.5 Main result
In this paper we take an entirely new approach to the problem of evaluating \(g_t(n)\). Rather than thinking about an algorithm itself, as Even and Lempel [1] did, we think about how to evaluate the generating function (3.4) of \(g_t(n)\) established in [3] efficiently. As it turns out, the cost essentially comes from one division of power series of precision N (see Footnote 2) whose coefficients are integers bounded by \(O({\rho ^N})\) (with \(\rho <2\)).
Estimating the cost of this evaluation strategy leads to a tremendous improvement over the cost of using [1]: to be precise, by a factor \(N^2\) in both ring operations and bit operations. It is not obvious that the costs for evaluating numerator and denominator of the generating function are asymptotically (much) smaller than the total cost; see Theorem 1 for details and also Sect. 5, which provides even more details within the proof of this theorem. In particular, we show that the costs for evaluating numerator and denominator are asymptotically smaller by almost (neglecting logarithmic factors) a factor N.
Using the multiplication algorithms of Schönhage and Strassen [8], of Fürer [9, 10], or of Harvey and van der Hoeven [11] (see Sect. 2 for an overview), our algorithm leads to \(N (\log N)^2 \, 2^{O({\log ^* N})}\) operations in the integer ring and consequently \(N^2 (\log N)^4 \, 2^{O({\log ^* N})}\) bit operations, where \(\log ^* N\) denotes the iterated logarithm (see Footnote 3). In Remark 6.2, a discussion of the memory requirements can be found. An implementation of this algorithm, based on FLINT [12, 13] (which is, for example, included in the SageMath mathematics software [14]), is also available (see Footnote 4); see Appendix A for the relevant lines of code and remarks related to the implementation, as well as for a discussion of its running times.
The literature describes a number of algorithms constructing the complete list of t-ary Huffman codes of length \(r=1+n(t-1)\); see [15,16,17,18]. No performance analysis is given there, but as the number of such codes grows exponentially in r, it is clear that listing all codes is not a fast method if one only wants to determine the number of such codes. The algorithm by Even and Lempel [1] computes the number \(f_2(n)\) without listing all codes, and is to the best of our knowledge the fastest algorithm previously known. Our algorithm relies on calculations involving power series with large integer coefficients.
It should also be emphasized that the output \(f_t(n)\) of the algorithm grows exponentially in n (as mentioned above); therefore the number of bits needed to represent \(f_t(n)\) is linear in n, whereas the input size is only logarithmic in n. The quite general survey paper by Klazar [19] studies classes of problems where the output needs at most a polynomial number of steps, in terms of the combined size of input and output. As we can compute \(f_t(n)\) efficiently, this problem falls into the class considered by Klazar.
1.6 Notes
It should be pointed out that in this article, we derive and compare upper bounds. It might be that the actual costs are smaller. However, as we compute the first N coefficients all at the same time and the coefficients grow exponentially in N, a lower bound for the number of bit operations necessarily contains a factor \(N^2\). Moreover, as multiplication of some sort is involved, lower order factors (growing with N) are expected as well.
We also mention that the following is open: How fast can a single coefficient \({g_t}({n})\) (in contrast to all coefficients with \(n<N\)) be computed?
2 Cost of the underlying operations
In this section, we give a brief overview on the time requirements for performing addition and multiplication of two integers and for performing multiplication and division of power series. The current state of the art is also summarized in Table 1.
2.1 Addition and multiplication
First, assume that we want to perform addition of two numbers bounded by \(2^M\), i.e., numbers with M bits. We have to look at each bit of the numbers exactly once and add those (maybe with a carry). Therefore, we need O(M) bit operations.
Next, we look at multiplication of two numbers bounded by \(2^M\). It is clear that this can be achieved with \(O({M^2})\) operations, but it can be done better. An overview is given in the survey article by Karatsuba [20]. The Karatsuba multiplication algorithm [21, 22] has a complexity of \(O({M^{\log _2 3}})\). A faster generalisation of it is the Toom–Cook algorithm [23]. Combining Karatsuba multiplication with the Fast Fourier Transform algorithm (see Cooley and Tukey [24]) gives an algorithm with bit complexity \(O({M(\log M)^{(2+\varepsilon )}})\); see [25,26,27,28].
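To illustrate the divide-and-conquer idea behind the \(O({M^{\log _2 3}})\) bound, here is a hedged Python sketch of Karatsuba multiplication for nonnegative integers (our own illustration, splitting at a bit position m; not an optimized implementation):

```python
def karatsuba(x, y):
    """Multiply nonnegative integers x and y using three recursive
    half-size multiplications instead of four."""
    if x < 2**32 or y < 2**32:
        return x * y  # base case: fall back to machine multiplication
    m = max(x.bit_length(), y.bit_length()) // 2
    x1, x0 = x >> m, x & ((1 << m) - 1)   # x = x1*2^m + x0
    y1, y0 = y >> m, y & ((1 << m) - 1)   # y = y1*2^m + y0
    hi = karatsuba(x1, y1)
    lo = karatsuba(x0, y0)
    # (x1 + x0)(y1 + y0) - x1*y1 - x0*y0 = x1*y0 + x0*y1
    mid = karatsuba(x1 + x0, y1 + y0) - hi - lo
    return (hi << (2 * m)) + (mid << m) + lo
```

The recursion replaces one M-bit product by three products of roughly M/2 bits, which gives the recurrence \(T(M) = 3T(M/2) + O(M)\) and hence \(T(M) = O(M^{\log _2 3})\).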
The multiplication algorithm given by Schönhage and Strassen (see [8]) takes \(O({M\log M \log \log M})\) time. It also uses fast Fourier transform. An asymptotically even faster multiplication algorithm is given by Fürer [9, 10]. It has computational complexity \(M\log M\, 2^{O({\log ^* M})}\), where we again denote the iterated logarithm by \(\log ^* M\). Fürer’s algorithm uses complex arithmetic. A related algorithm of the same complexity but using modular arithmetic is due to De, Kurur, Saha and Saptharishi [29, 30].
The asymptotically fastest known multiplication algorithm is due to Harvey and van der Hoeven [11]; it has a computational complexity of \(O({M\log M})\).
2.2 Power series operations
Let us also summarize the complexity of power series computations; for references see the books of Cormen, Leiserson, Rivest and Stein [31] or Knuth [28]. The multiplication can, again, be sped up by using fast Fourier transform. We can use the algorithms for integer multiplication presented above; see von zur Gathen and Gerhard [32]. Also, the computational complexity can be improved: given power series with precision N (i.e., the first N terms) over a ring, we can perform multiplication with \(N\log N\, 2^{O({\log ^* N})}\) ring operations using Fürer’s algorithm.
In order to perform division (inversion) of power series with precision N, we can use the Newton–Raphson method. We need at most \(4{m}({N}) + N\) ring operations, where m(N) denotes the number of operations needed to multiply two power series with precision N; see von zur Gathen and Gerhard [32, Theorem 9.4] for details; the additional summand m(N) in comparison to that theorem comes from the multiplication with the numerator. Therefore, by using Fürer’s algorithm, we can invert/divide with \(N\log N\, 2^{O({\log ^* N})}\) ring operations.
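The Newton–Raphson inversion can be sketched as follows (plain Python with a quadratic-time series multiplication for clarity; the complexity statement above of course requires a fast multiplication routine in its place):

```python
def series_mul(a, b, n):
    """First n coefficients of the product of the power series a and b."""
    c = [0] * n
    for i, ai in enumerate(a[:n]):
        if ai:
            for j, bj in enumerate(b[:n - i]):
                c[i + j] += ai * bj
    return c

def series_inv(f, n):
    """First n coefficients of 1/f for a power series f with f[0] = 1,
    by the Newton iteration g <- g*(2 - f*g), doubling the precision."""
    assert f[0] == 1
    g, prec = [1], 1
    while prec < n:
        prec = min(2 * prec, n)
        e = series_mul(f, g, prec)
        e = [2 - e[0]] + [-c for c in e[1:]]  # 2 - f*g
        g = series_mul(g, e, prec)
    return g

def series_div(a, f, n):
    """First n coefficients of a/f: one inversion plus one multiplication
    by the numerator, as in the text."""
    return series_mul(a, series_inv(f, n), n)
```

Each doubling step performs a constant number of multiplications at the current precision, which is how the \(O({m}({N}))\) bound on the ring operations arises.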
The bit size occurring in the ring operations for a division of power series with precision N and coefficients of bit size M is NM by the remarks after [32, Theorem 9.6]. Therefore, and by assuming \(M=O({N})\) for simpler expressions with respect to the logarithms, we end up with
$$\begin{aligned} N^2 M (\log N)^2 \, 2^{O({\log ^* N})} \end{aligned}$$
bit operations.
3 Cost for extracting coefficients
Our main result gives the number of operations needed for extracting the coefficients \({g_t}({n})\) for all \(n<N\). It reflects three different aspects: First, we count operations on a high level, for example power series multiplications. (Below we will denote this operation by \(\textbf{M}\).) Second, we count operations in the ring of integers. There, to stick with the example on power series multiplication, the precision of the power series is taken into account, but not the actual size of the integer. Finally and third, we count bit operations, where also the size of the coefficients (which are integers) is taken into account.
Let us make this more precise and start with the high level operations. We denote
- an addition (or a subtraction) of two power series by \(\textbf{A}\),
- a multiplication of two power series by \(\textbf{M}\), and
- a division of two power series by \(\textbf{D}\).
As we compute the first N terms, we may assume that all power series are of precision N.
An overview and summary of the number of ring operations and bit operations of these high level operations is provided in Sect. 2. Clearly, we have to deal with the size of the coefficients. We first note that for \(n<N\) each coefficient \({g_t}({n})\) can be written with \(M {:}{=} \lfloor {\log _2 {g_t}({N})}\rfloor +1\) bits and that by using the asymptotics (1.2) we can bound this by
$$\begin{aligned} M = N \log _2 \rho + O({1}) \end{aligned}$$ (3.1)
when N tends to \(\infty\). Here the constant \(\rho <2\) depends on t; see [3] for details on \(\rho\).
Summarizing, all the operations \(\textbf{A}\), \(\textbf{M}\) and \(\textbf{D}\) are performed on power series of precision N with coefficients written by M bits (numbers bounded by \(2^M\)), and the costs (number of bit operations) are stated in Sect. 2. There is one important remark at this point: we will see during our main proof (Sect. 5) that the coefficients appearing in power series additions and multiplications are actually much smaller than coefficients written by M bits; we will take this into account when counting bit operations.
Beside these main power series operations, we additionally denote
- other power series operations of precision N (for example, memory allocation or writing initial values) by \(\textbf{S}\), and
- other operations, more precisely operations on numbers with less than \(\log _2 N\) bits (for example additions of indices), by \(\textbf{O}\).
Thus, an operation \(\textbf{O}\) is performed on numbers bounded by N only (in contrast to the operations on numbers bounded by \(2^M\)).
With these notions and by collecting operations as formal sums of \(\textbf{A}\), \(\textbf{M}\), \(\textbf{D}\), \(\textbf{S}\) and \(\textbf{O}\), we can write down the precise formulation of our main theorem.
Theorem 1
Calculating the first N terms of \({g_t}({n})\) can be done with
$$\begin{aligned} (\log _t N + O({1}))\,(\textbf{M} + 2\,\textbf{A}) + \textbf{D} + O({\log N})\,(\textbf{S} + \textbf{O}) \end{aligned}$$ (3.2)
power series operations, with
$$\begin{aligned} N (\log N)^2 \, 2^{O({\log ^* N})} \end{aligned}$$
operations in the ring of integers, and with
$$\begin{aligned} N^2 (\log N)^4 \, 2^{O({\log ^* N})} \end{aligned}$$ (3.3)
bit operations.
In order to prove Theorem 1 (the complete proof can be found in Sect. 5), we look at the cost of calculating the first N terms, which is done by extracting coefficients of the power series
$$\begin{aligned} H(q) = \sum _{n\ge 0} {g_t}({n})\, q^n = \frac{\displaystyle \sum _{j\ge 0} (-1)^j q^{[{j}]} \prod _{i=1}^{j} \frac{q^{[{i}]}}{1-q^{[{i}]}}}{\displaystyle \sum _{j\ge 0} (-1)^j \prod _{i=1}^{j} \frac{q^{[{i}]}}{1-q^{[{i}]}}} \end{aligned}$$ (3.4)
with
$$\begin{aligned}{} [{j}] {:}{=} \frac{t^j-1}{t-1} = 1 + t + \dots + t^{j-1}. \end{aligned}$$ (3.5)
This generating function (3.4) can be found in Flajolet and Prodinger [2, Theorem 2] for \(t=2\) and in Elsholtz, Heuberger and Prodinger [3, Theorem 6] for general t. It is derived from the equivalent formulation as counting problem on trees, which was mentioned in the introduction.
4 Auxiliary results
When extracting the first N coefficients, we do not need the “full” generating function, i.e., the infinite sums in the numerator and denominator of (3.4) can be truncated to finite sums. The following lemma tells us how many coefficients we need. We use this asymptotic result in our analysis of the algorithm; for the actual computer programme, we can check indices and exponents by a direct computation.
Lemma 4.1
To calculate numerator and denominator of the generating function (3.4) with precision N, we need only summands with
$$\begin{aligned} j \le J = \log _t N + O({1}). \end{aligned}$$
Proof
Because of an additional factor \(q^{[{j}]}\) in each summand of the numerator, it is sufficient that the largest index of the denominator is less than N. Therefore, we will only look at the indices of the denominator.
Consider the summand of the denominator with index j. The lowest index of a non-zero coefficient of this summand is
$$\begin{aligned} \sigma _j = \sum _{i=1}^{j} [{i}] = \frac{t^{j+1}-(j+1)t+j}{(t-1)^2}, \end{aligned}$$
where the notation [i] is defined in Equation (3.5). We only need summands with \(\sigma _j < N\). Taking the logarithm yields
$$\begin{aligned} j + 1 + \log _t\Bigl (1 - \frac{(j+1)t-j}{t^{j+1}}\Bigr ) - \log _t (t-1)^2 < \log _t N. \end{aligned}$$
As the first logarithm tends to 0 as \(j\rightarrow \infty\) and the second is bounded, the error term O(1) is large enough and the result follows. \(\square\)
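Numerically, the threshold of Lemma 4.1 can be checked directly (the helper name `summands_needed` is ours):

```python
import math

def summands_needed(t, N):
    """Smallest J with sigma_J = sum_{i=1}^J [i] >= N, where
    [i] = (t^i - 1)/(t - 1); from this index on, the summands
    contribute nothing below q^N (cf. Lemma 4.1)."""
    sigma, j = 0, 0
    while sigma < N:
        j += 1
        sigma += (t**j - 1) // (t - 1)
    return j

print(summands_needed(2, 1000), math.log(1000, 2))  # 9, close to log_2 1000
```

The computed values stay within an additive constant of \(\log _t N\), as the lemma asserts.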
While the bit size of the coefficients \({g_t}({n})\) is linear in N, the size of the coefficients of numerator and denominator of (3.4) is much smaller. We make this precise by using the following lemma.
Lemma 4.2
For \(n \le N\), the nth coefficient of
$$\begin{aligned} \prod _{i=1}^{j} \frac{1}{1-q^{[{i}]}} \end{aligned}$$ (4.1)
with \(j \le J\) and J of Lemma 4.1, as well as the nth coefficients of numerator and denominator of (3.4), can be written with
$$\begin{aligned} \frac{(\log N)^2}{2 \log 2 \log t} + O({\log N}) \end{aligned}$$ (4.2)
bits.
Proof
We start proving the claimed result for (4.1) and postpone handling numerator and denominator of (3.4) to the end of this proof.
Each factor of (4.1) is a geometric series whose coefficients are either 0 or 1 and whose constant coefficient is 1. In particular, these coefficients are nonnegative. Therefore, it suffices to show the result for \(j=J\).
As the coefficients are either 0 or 1, the nth coefficient of the product equals the cardinality of the set
$$\begin{aligned} \bigl \{(a_1,\dots ,a_J) \in \mathbb {Z}_{\ge 0}^J :a_1 [{1}] + \dots + a_J [{J}] = n\bigr \}. \end{aligned}$$
By using the crude estimate \(a_i \le n/[{i}]\), we see that we have at most 2N/[i] choices for \(a_i\) because \(n\le N\) and \([{i}]<N\) by construction. Thus we can bound the cardinality of the set above by
$$\begin{aligned} \prod _{i=1}^{J} \frac{2N}{[{i}]}. \end{aligned}$$
We use \(J=\log _tN + O({1})\) of Lemma 4.1 to obtain
$$\begin{aligned} \log _2 \prod _{i=1}^{J} \frac{2N}{[{i}]} = J \log _2 (2N) - \sum _{i=1}^{J} \log _2 [{i}] = \frac{(\log N)^2}{2 \log 2 \log t} + O({\log N}), \end{aligned}$$
from which it follows that the nth coefficient of (4.1) is bounded by
$$\begin{aligned} 2^{\frac{(\log N)^2}{2 \log 2 \log t} + O({\log N})}. \end{aligned}$$ (4.3)
The result in terms of bit size follows by taking the logarithm.
Numerator and denominator are sums where J summands are added up (or subtracted). This corresponds to an additional factor J in the bound (4.3) or an additional summand \(\log _2J\) in the formula (4.2), respectively. As \(J=O({\log _tN})\) by Lemma 4.1, this is absorbed by the error term, so the same formula holds. \(\square\)
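The point of Lemma 4.2, polylogarithmically many bits for the coefficients of (4.1) versus linearly many for \({g_t}({n})\), can be observed numerically (the helper name is ours):

```python
def partial_product_coeffs(t, N):
    """First N coefficients of prod_{i>=1, [i]<N} 1/(1 - q^{[i]}),
    i.e. the number of representations n = a_1*[1] + a_2*[2] + ...
    with nonnegative integers a_i; cf. (4.1)."""
    c = [0] * N
    c[0] = 1
    i = 1
    while (t**i - 1) // (t - 1) < N:
        p = (t**i - 1) // (t - 1)
        for n in range(p, N):  # multiply by the geometric series 1/(1 - q^p)
            c[n] += c[n - p]
        i += 1
    return c

coeffs = partial_product_coeffs(2, 256)
print(max(x.bit_length() for x in coeffs))  # stays polylogarithmic in N
```

For \(t=2\) and \(N=256\) the maximal coefficient fits in a machine word, whereas \(g_2(255)\) needs on the order of \(255 \log _2 \rho\) bits.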
5 Proof of Theorem 1
We start with an overview of our strategy. For computing the first N coefficients of the generating function H(q) (see (3.4)), we only need the summands of the numerator and the denominator with \(j<J\) according to Lemma 4.1.
First, consider the denominator of H(q). We compute the products
$$\begin{aligned} \prod _{i=1}^{j} \frac{q^{[{i}]}}{1-q^{[{i}]}} \qquad \text {for } j=1,\dots ,J \end{aligned}$$
iteratively by expanding the J different terms \(q^{[{i}]} / (1-q^{[{i}]})\) as geometric series and performing power series multiplications. After each multiplication, we accumulate the result by using one power series addition.
We deal with the numerator in the same fashion. However, by performing the computation of numerator and denominator simultaneously, the above products only need to be evaluated once.
Finally, to obtain the first N coefficients of H(q), we need one power series division of numerator and denominator.
Pseudocode for our algorithm is given in Algorithm 1; an efficient implementation using the FLINT library is presented in Appendix A. The actual analysis of this algorithm is done by counting the operations needed, in particular the power series operations, and providing bounds for the bit sizes of the variables.
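As the pseudocode of Algorithm 1 is not reproduced here, the following Python sketch (our own illustration, with a naive quadratic-time division in place of the Newton iteration and the FLINT routines of the actual implementation) shows the overall strategy:

```python
def count_huffman_codes(t, N):
    """Return g_t(0), ..., g_t(N-1) by building numerator and denominator
    of the generating function (3.4) with precision N and dividing."""
    num, den = [0] * N, [0] * N
    num[0] = den[0] = 1          # the summand j = 0 of numerator and denominator
    prod = [1] + [0] * (N - 1)   # running product of the factors q^[i]/(1-q^[i])
    sign, j = -1, 1
    while (t**j - 1) // (t - 1) < N:      # only summands j <= J, cf. Lemma 4.1
        b = (t**j - 1) // (t - 1)         # [j] = 1 + t + ... + t^(j-1)
        # multiply prod by q^[j] / (1 - q^[j]):
        prod = [0] * b + prod[:N - b]     # shift by [j]
        for n in range(b, N):             # divide by (1 - q^[j])
            prod[n] += prod[n - b]
        for n in range(N):                # accumulate the denominator
            den[n] += sign * prod[n]
        for n in range(b, N):             # the numerator has an extra factor q^[j]
            num[n] += sign * prod[n - b]
        sign, j = -sign, j + 1
    # one power series division num/den (naive here; den[0] = 1)
    g = [0] * N
    for n in range(N):
        g[n] = num[n] - sum(den[k] * g[n - k] for k in range(1, n + 1))
    return g

print(count_huffman_codes(2, 11))  # [1, 1, 1, 2, 3, 5, 9, 16, 28, 50, 89]
```

The loop body mirrors the proof below: one shift, one multiplication by a geometric series, and two accumulating additions per pass; the final division is the single dominating power series division.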
Let us come to the actual proof.
Proof of Theorem 1
We analyse the code of Algorithm 1; see Appendix A for the details. It starts by initialising variables (memory allocation and initial values) for the power series operations, which contributes \(O({1})\textbf{S}\). Further initialisation is done by \(O({1})\textbf{O}\) operations.
For computing the first N coefficients of H(q) (see (3.4)), we only need the summands of numerator and denominator with \(j<J=\log _tN+O({1})\) according to Lemma 4.1. Speaking in terms of our computer programme, our outer loop needs J passes. We now describe what happens in each of these passes; the final cost needs then to be multiplied by J.
Suppose we are in step j. After some update of auxiliary variables (needing \(O({1})\textbf{O}\) operations), we compute the product
$$\begin{aligned} \prod _{i=1}^{j} \frac{q^{[{i}]}}{1-q^{[{i}]}} \end{aligned}$$
out of the product with factors up to index \(i=j-1\). Expanding \(q^{[{j}]} / (1-q^{[{j}]})\) as a geometric series contributes at most \(\textbf{S}\) and performing a power series multiplication contributes \(\textbf{M}\) and additionally one swap \(O({1})\textbf{S}\). For obtaining the number of bit operations, we need estimates of the coefficients appearing in the multiplication. Lemma 4.2 bounds their size by
$$\begin{aligned} \frac{(\log N)^2}{2 \log 2 \log t} + O({\log N}) \end{aligned}$$ (5.1)
bits. Therefore each of our power series multiplications \(\textbf{M}\) needs
$$\begin{aligned} N (\log N)^3 \, 2^{O({\log ^* N})} \end{aligned}$$
bit operations by the results of Fürer [9, 10] and Harvey and van der Hoeven [11]; see also Sect. 2.
After each multiplication, we accumulate the results for numerator and denominator by using one power series addition \(\textbf{A}\) for each of the two. For the numerator, we additionally need \(O({1})\textbf{S}\) operations for the multiplication by \(q^{[{j}]}\) performed by shifting. Concerning bit operations, we use the bound of the coefficients for numerator and denominator provided by Lemma 4.2. In terms of bit size, this leads to the number of bits given in (5.1). Therefore a power series addition needs
$$\begin{aligned} O({N (\log N)^2}) \end{aligned}$$
bit operations.
In total, we end up with
$$\begin{aligned} (\log _t N + O({1}))\,\bigl (\textbf{M} + 2\,\textbf{A} + O({1})\,\textbf{S} + O({1})\,\textbf{O}\bigr ) \end{aligned}$$
operations to evaluate the outer loop; these operations translate to
$$\begin{aligned} N (\log N)^2 \, 2^{O({\log ^* N})} \end{aligned}$$
operations in the ring of integers and to
$$\begin{aligned} N (\log N)^4 \, 2^{O({\log ^* N})} \end{aligned}$$
bit operations.
We are now ready to collect all costs for proving the first part of Theorem 1. Additionally to the above, we divide the numerator by the denominator and need one power series division \(\textbf{D}\). The clean-up accounts for \(O({1})\textbf{S}\). This yields (3.2).
Using the Newton–Raphson method and Fürer’s algorithm (see Sect. 2 and Table 1), a power series division \(\textbf{D}\) results in
$$\begin{aligned} N \log N \, 2^{O({\log ^* N})} \end{aligned}$$
operations in the ring. Its operands (see Footnote 5) have bit size
$$\begin{aligned} O({N (\log N)^2}), \end{aligned}$$
which results in
$$\begin{aligned} N^2 (\log N)^4 \, 2^{O({\log ^* N})} \end{aligned}$$
bit operations for our computations.
We note that the number of bit operations of a power series operation \(\textbf{S}\) is linear in N as the coefficients are bounded and that \(\textbf{O}\) is an operation on numbers with \(O({\log N})\) bits. The error term includes all these. Collecting all bit operation results gives the upper bound (3.3). \(\square\)
6 Remarks
In this last section, we provide some remarks related to the above proof and coefficient extraction algorithm.
Remark 6.1
In the proof above, we have seen that the cost (bit operations) of the power series division is asymptotically roughly (not taking logarithms and smaller factors into account) a factor N larger than the cost for computing numerator and denominator, and all the overhead cost.
Moreover, only focusing on the computation of numerator and denominator, the costs (again bit operations) for computing these two are asymptotically dominated by power series multiplication, albeit only by roughly (again not taking into account logarithmically smaller factors) a factor \(\log N\) compared to addition and other power series operations.
Note that when only considering operations in the integer ring, then the multiplications performed in the evaluation of numerator and denominator take the asymptotically leading role by \(N (\log N)^2 \, 2^{O({\log ^* N})}\) operations compared to \(N (\log N) \, 2^{O({\log ^* N})}\) ring operations of the power series division.
At the end of this article we make a short remark on the memory requirements for the presented coefficient extraction algorithm.
Remark 6.2
Our algorithm needs O(N) units of memory (a unit stands for the memory requirements of storing a number bounded by \(\rho ^N\)) plus the memory needed for the power series multiplication and division (see Footnote 6). The above means that we can bound the memory requirements by \(O({N^2})\) bits.
Notes
1. A Huffman code over an alphabet of t letters is a prefix-free subset (the set of “code words”) of the set of finite words over this alphabet, i.e., no code word is a prefix of another code word. It is said to be compact if no further code word can be added without violating the prefix-freeness condition. Two compact Huffman codes are considered to be equivalent if the multisets of the lengths of the code words are equal, and one can choose a representative where shorter words are lexicographically smaller than longer words.
2. We say that a power series H has precision N if we can write it as \(H(q) = \sum _{n=0}^{N-1} h_n q^n + O({q^N})\) with explicit coefficients \(h_n\).
3. The iterated logarithm (also called log star) gives the number of applications of the logarithm so that the result is at most 1. For example, we can define it recursively by \(\log ^* M = 1+\log ^*(\log M)\) if \(M>1\) and \(\log ^* M = 0\) otherwise.
4. The code accompanying this article can be downloaded from https://gitlab.com/dakrenn/count-nonequivalent-compact-huffman-codes.
5. The actual bit size during the division is \(\frac{N(\log N)^2}{2(\log 2 )(\log t)} + O({N\log N})\); see the end of Sect. 2 for details.
6. We have been unable to find a reference for the memory requirements of, for example, the Schönhage–Strassen-algorithm. It seems that the GNU Multiple Precision Arithmetic Library (GMP) can do this with 12N units of memory; see [33] for a comment of one of its authors.
References
Even, S., Lempel, A.: Generation and enumeration of all solutions of the characteristic sum condition. Inf. Control 21, 476–482 (1972)
Flajolet, P., Prodinger, H.: Level number sequences for trees. Discret. Math. 65(2), 149–156 (1987)
Elsholtz, C., Heuberger, C., Prodinger, H.: The number of Huffman codes, compact trees, and sums of unit fractions. IEEE Trans. Inf. Theory 59, 1065–1075 (2013)
Heuberger, C., Krenn, D., Wagner, S.: Analysis of parameters of trees corresponding to Huffman codes and sums of unit fractions. In: Proceedings of the Meeting on Analytic Algorithmics & Combinatorics (ANALCO), New Orleans, LA, USA, pp. 33–42. SIAM, Philadelphia (2013)
Komlós, J., Moser, W., Nemetz, T.: On the asymptotic number of prefix codes. Mitt. Math. Sem. Giess. 165, 35–48 (1984)
The On-Line Encyclopedia of Integer Sequences. http://oeis.org (2013)
Heuberger, C., Krenn, D., Wagner, S.: Canonical trees, compact prefix-free codes and sums of unit fractions: a probabilistic analysis. SIAM J. Discret. Math. 29(3), 1600–1653 (2015)
Schönhage, A., Strassen, V.: Schnelle Multiplikation großer Zahlen. Comput. Arch. Elektron. Rechnen 7, 281–292 (1971)
Fürer, M.: Faster integer multiplication. In: STOC’07—Proceedings of the 39th Annual ACM Symposium on Theory of Computing, pp. 57–66. ACM, New York (2007)
Fürer, M.: Faster integer multiplication. SIAM J. Comput. 39(3), 979–1005 (2009)
Harvey, D., van der Hoeven, J.: Integer multiplication in time \(O(n \log n)\). Ann. Math. 193(2), 563–617 (2021)
Hart, W.B., Johansson, F., Pancratz, S.: FLINT: Fast Library for Number Theory, Version 2.5.2 (2015). http://flintlib.org
Hart, W.B.: Fast library for number theory: an introduction. In: Mathematical Software---ICMS: Lecture Notes in Computer Science, vol. 6327, pp. 88–91. Springer, Berlin (2010)
The SageMath Developers: SageMath Mathematics Software (Version 8.3) (2018). http://www.sagemath.org
Hoffman, D., Johnson, P., Wilson, N.: Generating Huffman sequences. J. Algorithms 54(1), 115–121 (2005)
Khosravifard, M., Esmaeili, M., Saidi, H., Gulliver, T.A.: A tree based algorithm for generating all possible binary compact codes with \(n\) codewords. IEICE Trans. Fundam. 2003(10), 2510–2516 (2003)
Narimani, H., Khosravifard, M.: The supertree of the compact codes. In: Proceedings of the International Symposium on Telecommunications, pp. 649–655 (2008)
Niyaoui, O., Reda, O.M.: A new algorithm for generating all binary Huffman codes based on path-length extensions. Int. J. Appl. Eng. Res. 11(21), 10618–10623 (2016)
Klazar, M.: What is an Answer? — Remarks, Results and Problems on PIO Formulas in Combinatorial Enumeration, Part I. arXiv:1808.08449 [math.CO]
Karatsuba, A.A.: The complexity of computations. Proc. Steklov Inst. Math. 211, 169–183 (1995)
Karatsuba, A.A., Ofman, Y.P.: Multiplication of many-digital numbers by automatic computers. Proc. USSR Acad. Sci. 145, 293–294 (1962)
Karatsuba, A.A., Ofman, Y.P.: Multiplication of many-digital numbers by automatic computers. Dokl. Phys. 7, 595–596 (1963)
Cook, S.A.: On the Minimum Computation Time of Functions. Harvard University, Cambridge (1966)
Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. Comput. 19, 297–301 (1965)
Borodin, A., Munro, I.: The Computational Complexity of Algebraic and Numeric Problems. American Elsevier Publishing Co. Inc, New York (1975)
Borwein, J.M., Borwein, P.B., Bailey, D.H.: Ramanujan, modular equations, and approximations to pi, or how to compute one billion digits of pi. Am. Math. Mon. 96, 201–219 (1989)
Brigham, E.O.: The Fast Fourier Transform. Prentice-Hall, Englewood Cliffs, New Jersey (1974)
Knuth, D.E.: Seminumerical Algorithms, The Art of Computer Programming, 3rd edn. Addison-Wesley, Boston (1998)
De, A., Kurur, P.P., Saha, C., Saptharishi, R.: Fast integer multiplication using modular arithmetic. In: STOC’08: Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, pp. 499–505. ACM, New York (2008)
De, A., Kurur, P.P., Saha, C., Saptharishi, R.: Fast integer multiplication using modular arithmetic. SIAM J. Comput. 42(2), 685–699 (2013)
Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn. The MIT Press, Cambridge (2001)
von zur Gathen, J., Gerhard, J.: Modern Computer Algebra, 2nd edn. Cambridge University Press, Cambridge (2003)
Granlund, T.: Schönhage–Strassen, gmp-discuss mailing list (2010). http://gmplib.org/list-archives/gmp-discuss/2010-August/004277.html
Funding
Open access funding provided by Paris Lodron University of Salzburg.
C. Elsholtz is supported by the Austrian Science Fund (FWF): W1230 and by Project Arithrand of the Austrian Science Fund (FWF): I 4945-N and of ANR-20-CE91-0006. C. Heuberger and D. Krenn are supported by the Austrian Science Fund (FWF): P28466-N35.
Appendices
Appendix A. Code
Below are the relevant lines of a programme written in C for computing the coefficients \({g_t}({n})\) with \(n<N\). The code can be found at https://gitlab.com/dakrenn/count-nonequivalent-compact-huffman-codes. The programme uses FLINT [12, 13]. Note that we do not use aliasing of input and output arguments in multiplication because providing our own auxiliary polynomial brings tiny performance improvements.
Appendix B. Timing
The table below contains timings (in seconds) for computing the first N coefficients with \(t=2\).
| t | N | \(t_{\mathrm {n \& d}}\) | \(t_{\textrm{division}}\) | \(t_{\textrm{total}}\) |
|---|---|---|---|---|
| 2 | 256 | 0.000 | 0.001 | 0.001 |
| 2 | 512 | 0.001 | 0.005 | 0.006 |
| 2 | 1024 | 0.002 | 0.014 | 0.016 |
| 2 | 2048 | 0.005 | 0.045 | 0.050 |
| 2 | 4096 | 0.013 | 0.126 | 0.139 |
| 2 | 8192 | 0.029 | 0.562 | 0.591 |
| 2 | 16384 | 0.067 | 2.225 | 2.292 |
| 2 | 32768 | 0.251 | 10.709 | 10.960 |
| 2 | 65536 | 0.617 | 45.259 | 45.877 |
| 2 | 131072 | 1.189 | 198.995 | 200.184 |
Here, \(t_{\mathrm {n \& d}}\) is the time for generating numerator and denominator, \(t_{\textrm{division}}\) for the one power series division and \(t_{\textrm{total}} = t_{\mathrm {n \& d}} + t_{\textrm{division}}\).
The benchmark was executed on an Intel(R) Xeon(R) CPU E5-2630 v3 at 2.40GHz. The limiting factor for our computations is the memory requirement; it is the reason we computed at most \(N=2^{17}=131072\) coefficients.
The timings in the table are consistent with the theoretical result of this article; the \(O({N^{2+\varepsilon }})\) running time of the algorithm is visible in our implementation.
About this article
Cite this article
Elsholtz, C., Heuberger, C. & Krenn, D. Algorithmic counting of nonequivalent compact Huffman codes. AAECC (2023). https://doi.org/10.1007/s00200-022-00593-0