Heapability, Interactive Particle Systems, Partial Orders: Results and Open Problems
Abstract
We outline results and open problems concerning partitioning of integer sequences and partial orders into heapable subsequences (previously defined and established by Byers et al.).
Keywords
Heapable sequences, Posets
1 Introduction
Suppose \(a_{1},a_{2}, \ldots , a_{n}\) is a sequence of integers. Can one insert the elements of the sequence, successively, as the leaves of a binary tree that satisfies the min-heap property? This is possible, for instance, for the sequence \(1\ 3\ 2\ 7\ 6\ 5\ 4\) but not for the sequence \(5\ 4\ 3\ 2\ 1\). Byers et al. [1], who introduced the notion, called such a sequence heapable. They provided a polynomial-time algorithm to recognize heapability (though, interestingly, complete heapability, i.e. heapability on a complete binary tree, is NP-complete).
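The question above can be explored with a short greedy test. The sketch below is written in the spirit of the recognition algorithm of [1], assuming the rule "attach each new element below the largest earlier element not exceeding it that still has a free child slot"; the name `is_heapable` and the slot list are illustrative choices, not taken from the paper.

```python
from bisect import bisect_right, insort


def is_heapable(seq, k=2):
    """Greedy test: can seq be inserted, in order, as leaves of one k-ary min-heap?"""
    slots = []  # sorted multiset: one entry (the parent's value) per free child slot
    for idx, x in enumerate(seq):
        if idx > 0:
            pos = bisect_right(slots, x) - 1  # largest slot value <= x
            if pos < 0:
                return False  # no earlier element that is small enough has a free slot
            slots.pop(pos)  # x becomes a child of that element
        for _ in range(k):
            insort(slots, x)  # x arrives as a leaf with k free child slots
    return True


print(is_heapable([1, 3, 2, 7, 6, 5, 4]))  # the heapable example from the text
print(is_heapable([5, 4, 3, 2, 1]))        # the non-heapable example from the text
```

The tree found by the greedy rule need not be the complete tree one might draw by hand for the first example; only the yes/no answer is meant to agree.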
One can view the notion of heapability as a (parametric) relaxation of the notion of monotonicity. Indeed, heapability of a sequence requires that the smallest element come first. The next two elements may, however, arrive in any order, and the constraints on element ordering become progressively looser. The view of heapability as a generalization of monotonicity connects the study of heapable sequences to the rich theory built around the longest increasing subsequence [2].
In [3] we studied the partition of random permutations into heapable sequences. Similar results were obtained independently in [4]. Perhaps the most exciting finding was the scaling of the number of classes in a partition of a random permutation into heapable subsequences, conjectured to scale as \(\phi \cdot \ln (n)\), with \(\phi \) the golden ratio: in Sect. 5 we explain and motivate this conjecture.
This extended abstract continues this line of inquiry. We present some results and outline several open questions related to the problem of extending notions related to heapability from numbers to partial orders. More topics will be mentioned in the conference presentation.
2 Preliminaries
A (binary min-)heap is a binary tree, not necessarily complete for the purposes of this paper, such that \(A[parent[x]]\le A[x]\) for every non-root node x. If instead of binary we require the tree to be k-ary we get the concept of k-ary min-heap.
A partially ordered set \(P=(X,\prec )\) is called k-heapable if there exists some k-ary tree T whose nodes are in bijection with the elements of X, such that for every non-root node \(X_{i}\) and parent \(X_{j}\), \(X_{j}\prec X_{i}\) and \(j<i\). In particular, a 2-heapable partial order will simply be called heapable.
We easily recover the case of permutations, dealt with in [3], as follows: given permutation \(\pi \in S_{n}\), we define partial order \(\prec \) on \(\{1,2,\ldots , n\}\) by \(i\prec j\) iff \(i<j\) and \(\pi [i]<\pi [j]\).
The height of a partial order P, denoted by h(P), is the length of the longest chain (totally ordered subset) of P. The width of P, denoted by w(P), is the size of the largest antichain of P. By Dilworth’s Theorem [5], w(P) is equal to the smallest number of classes in a partition of P into chains. Finally, the dimension of P is the smallest number r such that the partial order is the intersection of r permutations.
Example 1
Let \(X=\{I_{1},I_{2},\ldots , I_{k}\}\) be a finite set of closed intervals on the real line, with the partial order \(I\preceq J\) given by \(end(I)\le start(J)\). By the Gallai theorems for intervals [6], height(P) is equal to the minimal number of points that pierce (i.e. intersect) every interval in P. On the other hand, width(P) is equal to the maximum cardinality of a set of intervals with nonempty joint intersection.
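Both quantities in this example are easy to compute directly. The following sketch assumes that all interval endpoints are distinct (so that touching intervals do not blur the line between "comparable" and "intersecting"); the function names are illustrative.

```python
def height_and_piercing(intervals):
    """Greedy by right endpoint: returns (a longest chain, a minimum piercing set)."""
    chain, points = [], []
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        # Keep the interval only if the last chosen point does not already pierce it.
        if not points or start > points[-1]:
            chain.append((start, end))
            points.append(end)
    return chain, points


def width(intervals):
    """Sweep line: largest number of intervals that share a common point."""
    events = sorted([(s, 1) for s, _ in intervals] + [(e, -1) for _, e in intervals])
    depth = best = 0
    for _, delta in events:
        depth += delta
        best = max(best, depth)
    return best


ivs = [(0, 3), (1, 5), (4, 8), (6, 9), (10, 12)]
chain, points = height_and_piercing(ivs)
print(len(chain), points, width(ivs))
```

The chain and the piercing set returned by the greedy pass have the same size, which is the equality between height and piercing number stated above.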
We give a parametric generalization of height(P) and width(P) as follows:
Definition 1
Given an integer \(k\ge 1\), a subset \(Q\subset P\) is a k-chain if the nodes of Q are the vertices of a k-ary \(\preceq \)-ordered subtree of P (not necessarily induced).
The k-height of P, denoted by k-\(h(P)\), is defined to be the size of the largest k-chain of P. The k-width of P, denoted by k-\(w(P)\), is defined as the minimal number of classes in a partition of P into k-chains.
3 The Computational Complexity of Generalized Height and Width
Open Problem 1

[GIVEN:] Partial order \(P=(X,\prec )\) and integer \(r\ge 1\).

[TO DECIDE:] Can X be partitioned into at most r k-chains? That is, is the inequality k-\(w(P)\le r\) true?
Even the case \(k=1\) (a.k.a. the longest heapable subsequence of a random permutation) is still open [1]. In contrast, the k-width of a finite partial order can be computed in polynomial time:
Theorem 1
For every fixed \(k\ge 1\) there is a polynomial-time algorithm that, given a finite partial order \(P=(X,\preceq )\) as input, computes the value k-\(w(P)\).
Proof
Define the following Boolean integer programming problem: introduce a variable \(X_{p,q}\) for every pair \(p\prec q\) in P. Intuitively, \(X_{p,q}=1\) if p is the parent of q in the k-chain decomposition of P, and 0 otherwise.
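One system consistent with the description that follows (at most one parent and at most k children per element, as many parent-child pairs as possible) can be written as below. It is a reconstruction under that assumption, not necessarily the authors' exact formulation.

\[
\begin{aligned}
\max \ & \sum _{p\prec q} X_{p,q}\\
\text{subject to}\ & \sum _{p:\, p\prec q} X_{p,q}\le 1 \quad \text{for every } q\in X,\\
& \sum _{q:\, p\prec q} X_{p,q}\le k \quad \text{for every } p\in X,\\
& X_{p,q}\in \{0,1\} \quad \text{for every pair } p\prec q.
\end{aligned}
\]

The constraint matrix of such a system is the incidence matrix of a bipartite graph (one side for elements in the role of parent, the other for elements in the role of child), hence totally unimodular, so its linear relaxation has an integral optimal solution; see [9] for the standard facts.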
Every integral solution to this system corresponds to a decomposition of P into k-ary trees: indeed, every node has at most one parent in the decomposition induced by the variables \(X_{p,q}=1\), and at most k children.
Since in each tree the number of edges is one less than the number of vertices, in any decomposition of P into k-chains the number of such chains is \(n-\sum \limits _{p\prec q} X_{p,q}\).
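The counting argument above can be tried out on small instances. The sketch below does not solve the integer program itself; it swaps in an equivalent combinatorial view, assumed here for illustration: each element offers k "child slots", and a maximum bipartite matching between slots and children (found by simple augmenting paths) maximizes the number of parent-child pairs. The names `k_width` and `permutation_order` are hypothetical helpers.

```python
def permutation_order(pi):
    """less(i, j) for the order i < j and pi[i] < pi[j] (0-based positions)."""
    return lambda i, j: i < j and pi[i] < pi[j]


def k_width(n, less, k=2):
    """n minus the maximum number of parent->child pairs (<=1 parent, <=k children)."""
    owner = [-1] * n  # owner[q] = child slot (p * k + t) currently holding q, or -1

    def augment(slot, seen):
        p = slot // k  # the element that owns this child slot
        for q in range(n):
            if q not in seen and less(p, q):
                seen.add(q)
                # Take q if it is free, or if its current slot can be re-routed.
                if owner[q] == -1 or augment(owner[q], seen):
                    owner[q] = slot
                    return True
        return False

    edges = sum(augment(slot, set()) for slot in range(n * k))
    return n - edges  # each tree has one more vertex than edges


pi = [1, 3, 2, 7, 6, 5, 4]
less = permutation_order(pi)
print(k_width(len(pi), less, k=1), k_width(len(pi), less, k=2))
```

For \(k=1\) the value is the ordinary width (here the length of the longest decreasing subsequence), while for \(k=2\) the heapable example sequence needs a single tree.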
Remark 1
The argument above owes much to a discussion with János Balogh from Szeged: we described to him a restricted version of the problem, that of scheduling intervals on binary trees. This amounts to the setting of Example 1. At the time we had a direct (somewhat complicated) proof of this special case. He came up with a (different but related) argument, using network flows. Subsequently we came up with this third proof for the general setting, obviously related to his.
Both our original argument and his extend to the general case, and will be jointly presented somewhere else. In retrospect, the fact that there are several distinct proofs is not surprising: Theorem 1 is obviously related to Dilworth’s Theorem, and the three existing proofs (direct, using network flows, using linear programming) can be seen as extensions of the corresponding arguments for proving this latter result.
4 The Asymptotic Behavior of the Average k-height and k-width
The problem of computing the 1-width of a random partial order of dimension 2 is a variant of the classical problem of computing the longest increasing subsequence of a random permutation. The correct asymptotic behavior is \(2\sqrt{n}\) [10, 11, 12, 13], and substantially more is known.
The (1-)width and (1-)height of a partial order have also been studied in other dimensions: notable partial results are due to Winkler [8], who showed that the correct order of magnitude for the height of a partial order of dimension k is \(\varTheta (n^{1/k})\). Further results were obtained by Brightwell [14].
As for the height, the 1-height of a d-dimensional partial order was considered by Winkler [8], and then determined by Bollobás and Winkler [15] to be approximately \(c_{d}\cdot n^{1/d}\) for some constant \(c_{d}>0\).
In [3] we gave a simple lower bound valid for all values of the k-width(P), where P is a random permutation of width 2. We extend this argument to all dimensions as follows:
Theorem 2
Proof
Open Problem 2
As for the k-height, a result from Byers et al. can be recast as \(h(P)=n-o(n)\) for almost all \(\pi \in S_{n}\). We easily generalize this result to random d-dimensional partial orders as follows:
Theorem 3
For all \(d\ge 2, k\ge 1\) and almost all permutations \(P\in P_{d}(n)\) we have k-\(h(P)=n-o(n)\).
Proof
A straightforward adaptation of the argument of Byers et al. [1]. Rather than with d-dimensional permutations, we will work with random points in \([0,1]^{d}\) (model II).
First one shows that w.h.p. k-\(h(P)=\varOmega (n)\), using an idea similar to the one in [1]: we consider the subcubes \([0,1/2]^{d}\) and \([1/2,1]^{d}\), respectively. Let \(A_{1}\) be the suborder of P determined by the restriction to the first n/2 elements and the first subcube. W.h.p. \(LIS(A_{1})=\varTheta (n^{1/d}).\) This follows from the result of Bollobás and Winkler [15], together with the result of Bollobás and Brightwell [16], which provides concentration of measure for \(LIS(A_{1})\).
Now we organize the subsequence \(A_{1}\) into a k-ary tree W with \(\varOmega (n^{1/d})\) leaves and continue to add elements of the subsequence \(A_{2}\), corresponding to points in the second half; we assume we add elements greedily, in the first possible subheap rooted at a node of \(A_{1}\) on the frontier of W, stopping when we can no longer place a node in the tree. With high probability this happens after adding \(\varOmega (n)\) nodes from \(A_{2}\): to see this we employ the observation that the stopping of the algorithm implies the existence of a decreasing sequence of \(A_{2}\) of size \(\varOmega (n^{1/d})\). We then apply the concentration inequality [16] for \(LDS(A_{2})\).
For the second, rescaled part of the proof, we search for constants \(\alpha , \beta > 0\) such that the subsequence \(B_{1}\), consisting of the points among the first \(n^{\alpha }\) ones that belong to the rectangle \([0,n^{-\beta }]^{d}\), has w.h.p. k-width \(\varOmega (n^{1/d+\epsilon })\). For this to happen, we take \(\alpha ,\beta \) so that \(\alpha -d\cdot \beta >1/d\). It is always possible to find some positive \(\alpha , \beta \) with this property, e.g. \(\alpha =1-\frac{1}{2d^2}, \beta =\frac{1}{2d^3}\). Now the subsequence \(B_{2}\), consisting of the points in the rectangle \([n^{-\beta },1]^{d}\) among the last \(n-n^{\alpha }\) ones, has w.h.p. its LDS of size \(\varTheta (n^{1/d})\). Thus the sequence \(B_{2}\) can w.h.p. be placed in its entirety on the tree W. The remaining parallelepipeds have o(1) volume, hence contain a sublinear number of points. The rest of the details are as in [1]. \(\square \)
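The suggested choice of constants can be checked by direct substitution. The computation below assumes that the condition reads \(\alpha -d\cdot \beta >1/d\) and that the example values are \(\alpha =1-\frac{1}{2d^{2}}\), \(\beta =\frac{1}{2d^{3}}\), i.e. with the minus signs restored.

\[
\alpha -d\cdot \beta = 1-\frac{1}{2d^{2}}-d\cdot \frac{1}{2d^{3}} = 1-\frac{1}{d^{2}},
\qquad
1-\frac{1}{d^{2}}>\frac{1}{d}
\iff d^{2}-d-1>0
\iff d>\frac{1+\sqrt{5}}{2}.
\]

Under these assumptions the condition holds for every integer \(d\ge 2\); for instance \(d=2\) gives \(3/4>1/2\).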
Let us note that a random d-dimensional partial order P can be regarded, by definition, as a subset (thinning) of a \((d-1)\)-dimensional partial order Q: if \(P_{1},P_{2},\ldots , P_{d}\) are the permutations defining P, simply define Q to be the intersection of \(P_{1},P_{2},\ldots , P_{d-1}\). So the previous result can be interpreted as the statement that no constant amount of thinning is enough to reduce the width of a random permutation to sublinear.
5 The Special Case \(d=2\)
In the special case of heapable sequences and random permutations (\(d=2\)) we have better insights on the constants \(c_{k,d}\) from the above open problem:
Conjecture 1
Open Problem 3
Prove this conjecture.
In the next section we sketch some of the experimental and non-rigorous theoretical evidence for this result. The calculations are non-rigorous, “physics-like”, and have yet to be converted to a rigorous argument.
5.1 The Connection with the Multiset Hammersley Process
One of the most rewarding ways to analyze the asymptotic behavior of the LIS of a random permutation is the connection with a model from nonequilibrium statistical physics called the Hammersley process.
The easiest way to describe the Hammersley process is via a sequence of random numbers \(X_{1},X_{2},\ldots , X_{n}, \ldots \in (0,1)\) (note that this combinatorial description is good for our purposes; the general Hammersley process assumes a unit intensity Poisson process on the real line).
We interpret the \(X_{i}\)’s as particles. At each moment the insertion of a new particle \(X_{i}\) removes (kills) the smallest particle \(X_{j}\) with \(X_{j}>X_{i}\), if any. Intuitively, particles correspond to pile heads in patience sorting, a well-known algorithm for computing the LIS. The piles are nondecreasing, hence putting a new particle on a pile with head \(X_{j}\) “kills” \(X_{j}\). Particles that are the largest at the moment when inserted do not kill any particle but simply start a new pile.
A sequence Y of n random particles corresponds naturally to a random permutation of order n. The live particles in the Hammersley process correspond to piles in patience sorting. Therefore LIS(Y) is equal to the number of live particles.
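The combinatorial description above translates almost literally into code. The following is a minimal sketch, assuming distinct particle values; the function name is illustrative.

```python
from bisect import bisect_right
import random


def hammersley_live_particles(xs):
    """Run the combinatorial Hammersley process on xs; return the live particles."""
    live = []  # live particles, kept in increasing order
    for x in xs:
        pos = bisect_right(live, x)  # index of the smallest live particle > x
        if pos < len(live):
            live[pos] = x    # x kills that particle and takes its place in the order
        else:
            live.append(x)   # x is the largest so far: it starts a new pile
    return live


random.seed(1)
sample = [random.random() for _ in range(10_000)]
# The count can be compared informally with the 2*sqrt(n) asymptotics (here 200).
print(len(hammersley_live_particles(sample)))
```

The list `live` is exactly the list of pile heads maintained by patience sorting, so its final length is the length of a longest increasing subsequence of the input.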
The correspondence between live particles and trees in an optimal decomposition of a random permutation carries over to the framework of heapability as well, with a twist: in the multiset generalization of the Hammersley process (defined in [3] and denoted by \(HAM_{k}\)) every particle comes with a fixed number k of lives. A particle \(X_{i}\) does not kill outright the smallest particle \(X_{j}>X_{i}\): it simply removes one of its lives.
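A particle system with lives is just as easy to simulate. The sketch below follows the convention of the text (a new particle takes one life from the smallest particle above it that still has a life) and assumes distinct values; the name `ham_k` and the returned pair are illustrative choices. Under this convention each parent lies above its children; replacing every x by 1 - x gives the min-heap reading.

```python
from bisect import bisect_right, insort


def ham_k(xs, k=2):
    """Simulate the k-lives process; return (number of new piles, lives per particle)."""
    alive = []      # values of particles with at least one life left, sorted
    lives = {}      # particle value -> remaining lives (distinct values assumed)
    new_piles = 0
    for x in xs:
        pos = bisect_right(alive, x)  # smallest particle above x with a life left
        if pos < len(alive):
            y = alive[pos]
            lives[y] -= 1             # y loses one life
            if lives[y] == 0:
                alive.pop(pos)        # y stays in the system, but with 0 lives
        else:
            new_piles += 1            # nothing above x: x starts a new pile (tree)
        lives[x] = k
        insort(alive, x)
    return new_piles, lives


piles, lives = ham_k([0.5, 0.2, 0.8, 0.1, 0.9, 0.3], k=2)
print(piles, lives)
```

With `k=1` the pile count coincides with the number of live particles of the ordinary process; with `k=2` the dictionary of lives records how many particles hold 0, 1 or 2 lives.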

At stage n the “typical” configuration of the \(HAM_{2}\) process is characterized by n particles holding 0, 1 or 2 lives.

The number of particles holding \(\lambda \) lives, for \(\lambda \in \{0,1,2\}\) is approximately equal to \(d_{\lambda }\cdot n\), for some constants \(0<d_{\lambda }<1\). That is, the global density of particles with \(\lambda \) lives converges asymptotically to \(d_{\lambda }\).

Moreover, particles with \(\lambda \) lives are distributed approximately uniformly at random throughout interval (0, 1), so that the relative densities are valid not only globally, but throughout each bin.
So how does this hydrodynamical limit predict the claimed scaling behavior, \(E[2\text{-}w(P)]\sim \frac{1+\sqrt{5}}{2}\cdot \ln (n)\)?
6 High-Dimensional Permutations
Linial has initiated [18], under the slogan of “high-dimensional combinatorics”, the study of a multidimensional analog of permutations. A p-dimensional permutation of order n is an \(n\times n\times \ldots \times n= [n]^{p+1}\) array of 0/1 values in which each line (obtained by setting p indices to values in [n] and leaving free the remaining coordinate) contains exactly one 1. Ordinary permutations correspond to the one-dimensional case, whereas two-dimensional permutations are essentially Latin squares.
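The definition can be checked mechanically on small arrays. In the sketch below the array is represented, purely for illustration, by a function `cell` from index tuples to 0/1; the helper names are hypothetical.

```python
from itertools import product


def is_high_dim_permutation(cell, n, p):
    """cell(idx) -> 0/1 for idx in range(n)**(p+1); True iff every line sums to 1."""
    for free in range(p + 1):                        # the coordinate left free
        for fixed in product(range(n), repeat=p):    # values of the other p indices
            line = sum(cell(fixed[:free] + (t,) + fixed[free:]) for t in range(n))
            if line != 1:
                return False
    return True


def from_latin_square(square):
    """Array with a 1 at (i, j, v) exactly when square[i][j] == v."""
    return lambda idx: int(square[idx[0]][idx[1]] == idx[2])


def from_permutation(pi):
    """Ordinary permutation as an n x n 0/1 matrix (the case p = 1)."""
    return lambda idx: int(pi[idx[0]] == idx[1])


latin = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
print(is_high_dim_permutation(from_latin_square(latin), n=3, p=2))
print(is_high_dim_permutation(from_permutation([2, 0, 1]), n=3, p=1))
```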
Recently, Linial and Simkin [19] have considered notions of monotonicity in high-dimensional permutations, proving a high-dimensional analog of the Erdős–Szekeres theorem. They subsequently studied the scaling of the LIS of a random multidimensional permutation, obtaining \(E[LIS(\pi )]=\varTheta (n^{p/(p+1)})\) for a random p-dimensional permutation.
Open Problem 4
Study the heapability (2-width and 2-height) of random high-dimensional permutations.
7 Partition into (un)equal Parts: Entropy and Compression
So far we have been interested in the partition of a sequence of numbers into a minimal number of k-chains.
One may want, instead, a partition that insists on parts as equal/unequal as possible. Porfilio [4] showed that the problem of dividing a sequence of integers into a number of equal parts is NP-complete.
One may look for the opposite kind of division, that into mostly unbalanced parts. One way to measure the imbalance is via the entropy of the distribution induced on the poset by a partition into k-chains. Of course, of all distributions with finite support the uniform distribution has the largest entropy. Minimizing entropy is an objective of recent interest in combinatorial optimization [20, 21, 22, 23, 24, 25, 26].
Open Problem 5
Study the complexity of partitioning a poset P into k-chains leading to a distribution of minimal entropy.
The open problem is easily seen to be related to the minimum entropy coloring problem for interval graphs. Chromatic entropy is a natural measure with important applications to coding [20, 27, 28].

for \(k=1,d=2\): compute a longest increasing subsequence \(L_{1}\) of P using patience sorting (or dynamic programming).

for other values of the pair (k, d): use instead the Byers et al. algorithm for finding a longest heapable subsequence with \(n-o(n)\) elements.

remove \(L_{1}\) from P and proceed recursively.
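For the case \(k=1, d=2\) the steps above can be sketched as follows. The code is only an illustration of the greedy peeling idea (patience sorting with back-pointers, then removal and recursion, written as a loop); the function names and the use of base-2 entropy are choices made here, and no approximation guarantee is claimed.

```python
from bisect import bisect_left
from math import log2


def longest_increasing_subsequence(seq):
    """Patience sorting with back-pointers; returns the indices of one LIS of seq."""
    tops, top_idx, prev = [], [], [None] * len(seq)
    for i, x in enumerate(seq):
        pos = bisect_left(tops, x)  # leftmost pile whose head is >= x
        prev[i] = top_idx[pos - 1] if pos > 0 else None
        if pos == len(tops):
            tops.append(x)
            top_idx.append(i)
        else:
            tops[pos] = x
            top_idx[pos] = i
    out, i = [], (top_idx[-1] if top_idx else None)
    while i is not None:  # walk the back-pointers from the head of the last pile
        out.append(i)
        i = prev[i]
    return out[::-1]


def greedy_peeling(seq):
    """Repeatedly remove a longest increasing subsequence; return the parts."""
    rest, parts = list(seq), []
    while rest:
        chosen = set(longest_increasing_subsequence(rest))
        parts.append([x for i, x in enumerate(rest) if i in chosen])
        rest = [x for i, x in enumerate(rest) if i not in chosen]
    return parts


def entropy(parts):
    """Shannon entropy (in bits) of the distribution of part sizes."""
    n = sum(len(p) for p in parts)
    return -sum(len(p) / n * log2(len(p) / n) for p in parts)


parts = greedy_peeling([3, 1, 4, 2, 5])
print(parts, round(entropy(parts), 3))
```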
Open Problem 6
Can one give guarantees on the approximation performance of these algorithms?
Finally, the decomposition of permutations into components (e.g. runs) forms the basis of the recent theory of data structures and methods for compressing permutations [29, 30] and partial orders. A question that arose during a conversation with Travis Gagie at CPM’2015, and that we would like to state here as an open question is
Open Problem 7
Is the decomposition of sequences into trees, of the sort employed in computing the 2-width of a partial order, relevant to compression as well?
Acknowledgments
This research has been supported by CNCS IDEI Grant PN-II-ID-PCE-2011-3-0981 “Structure and computational difficulty in combinatorial optimization: an interdisciplinary approach”.
References
1. Byers, J., Heeringa, B., Mitzenmacher, M., Zervas, G.: Heapable sequences and subsequences. In: Proceedings of ANALCO, pp. 33–44 (2011)
2. Romik, D.: The Surprising Mathematics of Longest Increasing Subsequences, vol. 4. Cambridge University Press, New York (2015)
3. Istrate, G., Bonchiş, C.: Partition into heapable sequences, heap tableaux and a multiset extension of Hammersley’s process. In: Cicalese, F., Porat, E., Vaccaro, U. (eds.) CPM 2015. LNCS, vol. 9133, pp. 261–271. Springer, Heidelberg (2015)
4. Porfilio, J.: A combinatorial characterization of heapability. Master’s thesis. Williams College, May 2015. http://library.williams.edu/theses/pdf.php?id=790
5. Dilworth, R.P.: A decomposition theorem for partially ordered sets. Ann. Math. 51, 161–166 (1950)
6. Gyárfás, A., Lehel, J.: Covering and coloring problems for relatives of intervals. Discrete Math. 55(2), 167–180 (1985)
7. Brightwell, G.: Models of random partial orders. In: Surveys in Combinatorics, pp. 53–83 (1993)
8. Winkler, P.: Random orders. Order 1(4), 317–331 (1985)
9. Schrijver, A.: Theory of Linear and Integer Programming. Wiley, New York (1998)
10. Hammersley, J.M., et al.: A few seedlings of research. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Theory of Statistics (1972)
11. Logan, B.F., Shepp, L.A.: A variational problem for random Young tableaux. Adv. Math. 26(2), 206–222 (1977)
12. Vershik, A., Kerov, S.V.: Asymptotics of Plancherel measure of symmetrical group and limit form of Young tables. Doklady Akademii Nauk SSSR 233(6), 1024–1027 (1977)
13. Aldous, D., Diaconis, P.: Hammersley’s interacting particle process and longest increasing subsequences. Probab. Theory Relat. Fields 103(2), 199–213 (1995)
14. Brightwell, G.: Random k-dimensional orders: Width and number of linear extensions. Order 9(4), 333–342 (1992)
15. Bollobás, B., Winkler, P.: The longest chain among random points in Euclidean space. Proc. Am. Math. Soc. 103(2), 347–353 (1988)
16. Bollobás, B., Brightwell, G.: The height of a random partial order: Concentration of measure. Ann. Appl. Probab. 2(4), 1009–1018 (1992)
17. Groeneboom, P.: Hydrodynamical methods for analyzing longest increasing subsequences. J. Comput. Appl. Math. 142(1), 83–105 (2002)
18. Linial, N., Luria, Z.: An upper bound on the number of high-dimensional permutations. Combinatorica 34(4), 471–486 (2014)
19. Linial, N., Simkin, M.: Monotone subsequences in high-dimensional permutations. arXiv preprint arXiv:1602.02719 (2016)
20. Cardinal, J., Fiorini, S., van Assche, G.: On minimum entropy graph colorings. In: Proceedings of ISIT 2004, The International Symposium on Information Theory, p. 43 (2004)
21. Halperin, E., Karp, R.: The minimum entropy set cover problem. Theor. Comput. Sci. 348(2–3), 340–350 (2005)
22. Cardinal, J., Fiorini, S., Joret, G.: Tight results on minimum entropy set cover. Algorithmica 51(1), 49–60 (2008)
23. Cardinal, J., Fiorini, S., Joret, G.: Minimum entropy orientations. Oper. Res. Lett. 36(6), 680–683 (2008)
24. Cardinal, J., Fiorini, S., Joret, G.: Minimum entropy combinatorial optimization problems. In: Ambos-Spies, K., Löwe, B., Merkle, W. (eds.) CiE 2009. LNCS, vol. 5635, pp. 79–88. Springer, Heidelberg (2009)
25. Kovačević, M., Stanojević, I., Šenk, V.: On the entropy of couplings. Inf. Comput. 242, 369–382 (2015)
26. Istrate, G., Bonchiş, C., Dinu, L.P.: The minimum entropy submodular set cover problem. In: Dediu, A.H., Janoušek, J., Martín-Vide, C., Truthe, B. (eds.) LATA 2016. LNCS, vol. 9618, pp. 295–306. Springer, Heidelberg (2016)
27. Alon, N., Orlitsky, A.: Source coding and graph entropies. IEEE Trans. Inform. Theory 42, 1329–1339 (1995)
28. Doshi, V., Shah, D., Médard, M., Effros, M.: Functional compression through graph coloring. IEEE Trans. Inf. Theory 56(8), 3901–3917 (2010)
29. Barbay, J., Munro, J.I.: Succinct encoding of permutations: Applications to text indexing. In: Encyclopedia of Algorithms, pp. 915–919. Springer (2008)
30. Barbay, J., Navarro, G.: On compressing permutations and adaptive sorting. Theoret. Comput. Sci. 513, 109–123 (2013)