Abstract
Approaches which “pool adjacent violators” are very simple and efficient methods for solving isotonic regression problems. We extend this class of algorithms to include non-uniform interval constraints in the complete order case. We prove correctness and linear computational complexity of the resulting approach. We also show that a straightforward implementation in C outperforms more general solvers on a sequence of specifically designed problem instances.
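For orientation, the classic uniform pool-adjacent-violators iteration that this note extends can be sketched in C as follows. This is a weighted least-squares sketch under the standard formulation; all identifiers are illustrative and not taken from the online resource pavic.c.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Classic pool-adjacent-violators algorithm: minimize
 * sum_i w[i]*(x[i]-a[i])^2 subject to x[0] <= ... <= x[n-1].
 * Blocks are kept as (weighted mean, total weight, length) triples;
 * adjacent blocks are pooled while they violate isotony. */
void pava(const double *a, const double *w, double *x, size_t n)
{
    double *level = malloc(n * sizeof *level); /* block means   */
    double *wt    = malloc(n * sizeof *wt);    /* block weights */
    size_t *len   = malloc(n * sizeof *len);   /* block sizes   */
    size_t nb = 0;                             /* block count   */

    for (size_t i = 0; i < n; ++i) {
        /* open a new block holding point i */
        level[nb] = a[i]; wt[nb] = w[i]; len[nb] = 1; ++nb;
        /* pool while the last two blocks violate isotony */
        while (nb > 1 && level[nb - 2] > level[nb - 1]) {
            double W = wt[nb - 2] + wt[nb - 1];
            level[nb - 2] = (wt[nb - 2] * level[nb - 2]
                             + wt[nb - 1] * level[nb - 1]) / W;
            wt[nb - 2] = W;
            len[nb - 2] += len[nb - 1];
            --nb;
        }
    }
    /* expand blocks back to the pointwise solution */
    for (size_t j = 0, i = 0; j < nb; ++j)
        for (size_t c = 0; c < len[j]; ++c)
            x[i++] = level[j];
    free(level); free(wt); free(len);
}
```

The two-pass structure (one forward sweep with pooling, one expansion) is what yields the linear running time in the unconstrained-interval case.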
Data availability
The online resource pavicDemo.c generates the data which we study in this note.
Notes
To be precise, their result is \(O(n\log U)\) with U no larger than the difference of the imposed upper and lower bounds. The factor \(\log U\) is typical for integer decision variables, see [8, p. 193].
In Algorithm 1, lines 1 to 8.
We provide the C code and header file as online resources pavic.c and pavic.h. The program pavicDemo.c demonstrates their use. Together with pavicOSQPDemo.c, dykstraDemo.c and gurobi_test.py, it creates the results we present.
References
de Leeuw, J., Hornik, K., Mair, P.: Isotone Optimization in R: Pool-Adjacent-Violators Algorithm (PAVA) and Active Set Methods (2009). https://cran.r-project.org/web/packages/isotone/vignettes/isotone.pdf
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2009)
Hu, X.: Application of the limit of truncated isotonic regression in optimization subject to isotonic and bounding constraints. J. Multivar. Anal. 71, 56–66 (1997)
Németh, A., Németh, S.: How to project onto the monotone nonnegative cone using Pool Adjacent Violators type algorithms. arXiv:1201.2343 (2012)
Chen, X., Lin, Q., Sen, B.: On degrees of freedom of projection estimators with applications to multivariate nonparametric regression. arXiv:1509.01877v4 (2018)
Luss, R., Rosset, S.: Bounded isotonic regression. Electron. J. Stat. 11, 4488–4514 (2017). https://doi.org/10.1214/17-EJS1365
Ahuja, R.K., Orlin, J.B.: A fast scaling algorithm for minimizing separable convex functions subject to chain constraints. Oper. Res. 49, 784–789 (2001)
Hochbaum, D., Queyranne, M.: Minimizing a convex cost closure set. SIAM J. Discrete Math. 16(2), 192–207 (2003)
Borwein, J., Lewis, A.: Convex Analysis and Nonlinear Optimization. Springer, New York (1999)
Welford, B.P.: Note on a method for calculating corrected sums of squares and products. Technometrics 4(3), 419–420 (1962)
Boyle, J.P., Dykstra, R.L.: A method for finding projections onto the intersection of convex sets in Hilbert spaces. Lect. Notes Stat. 37, 28–47 (1986)
Gurobi Optimization, LLC: Gurobi Optimizer Reference Manual (2022). https://www.gurobi.com
Stellato, B., Banjac, G., Goulart, P., Bemporad, A., Boyd, S.: OSQP: an operator splitting solver for quadratic programs. Math. Program. Comput. 12, 637–672 (2020). https://doi.org/10.1007/s12532-020-00179-2
Acknowledgements
We are grateful to Richard Cook, Mark Hogan, Dinesh Mehta, Bruce Richardson, Axel Tenbusch and Carlos Ugaz at NielsenIQ for their encouragement and support. We are particularly indebted to Ludo Daemen at NielsenIQ and to two anonymous reviewers. Their constructive and comprehensive criticism substantially improved our exposition.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
We are both employees of NielsenIQ. The algorithm we present is patent pending.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Appendices
Appendix A: An iterative projection property
As a byproduct of our investigation, we present in this appendix an iterative projection property which contains the result of [4] as a special case and may thus be interesting in its own right.
Proposition 10
Assume \(P_m^n\) has the minimizer \(x^{\star }_m,\ldots ,x^{\star }_n \in {\mathbb {R}}\) and let \(\ell \le u\) denote two additional bounds \(\ell ,u\in {\mathbb {R}}\), which also satisfy \(\ell \le u_m\) and \(\ell _n\le u\). Then, \(P_m^n\) augmented by the uniform interval constraints
$$\begin{aligned} \ell \le\,&x_i\le u&i&=m,\ldots ,n \end{aligned}$$ (A1)
is consistent with the minimizer
$$\begin{aligned} y_i=\min \bigl (\max (x^{\star }_i,\ell ),u\bigr )&&i&=m,\ldots ,n. \end{aligned}$$ (A2)
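In other words, once the unconstrained-bounds minimizer is known, imposing uniform bounds \(\ell \le u\) amounts to clamping each coordinate into \([\ell ,u]\), which preserves isotony. A minimal C sketch of this step (the function name is ours, not from pavic.c):

```c
#include <assert.h>
#include <stddef.h>

/* Clamp each coordinate of an isotonic minimizer into [l, u].
 * Per Proposition 10, this yields the minimizer of the problem
 * augmented by the uniform interval constraints l <= x_i <= u.
 * Clamping a nondecreasing sequence keeps it nondecreasing. */
void clamp_solution(double *x, size_t n, double l, double u)
{
    for (size_t i = 0; i < n; ++i) {
        if (x[i] < l)      x[i] = l;
        else if (x[i] > u) x[i] = u;
    }
}
```

With \(\ell =0\), this recovers the projection onto the monotone nonnegative cone studied in [4] as a special case.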
Collecting elements for the proof of Proposition 10, we start with a corollary to Lemma 5.
Corollary 11
Let \(x^{\star }_m,\ldots ,x^{\star }_n\) minimize \(P_m^n\), let \(k\in \{m,\ldots ,n\}\) and let \(y_m,\ldots ,y_k\) minimize \(P_m^k\). Then we have \(x^{\star }_k\le y_k\).
Proof
The case \(k=n\) follows directly from part 4. of Lemma 1. So we assume \(k<n\) and let \(z_{k+1},\ldots ,z_n\in {\mathbb {R}}\) minimize \(P_{k+1}^n\). Consider two cases:
-
1.
In case \(y_k\le z_{k+1}\), the concatenation \(y_m,\ldots ,y_k,z_{k+1},\ldots ,z_n\) minimizes \(P_m^n\) according to Lemma 2. Since the minimizer is unique according to part 4. of Lemma 1, we conclude \(x^{\star }_k=y_k\).
-
2.
In case \(y_k>z_{k+1}\), Lemma 5 entails \(y_k\ge x^{\star }_k\ge z_{k+1}\).
Both cases entail \(x^{\star }_k\le y_k\) as asserted. \(\square\)
Denote with \((P_{\ell }^u)_m^n\) the problem \(P_m^n\) augmented by (A1). Denote with \((P_{\ell })_m^n\) the problem \(P_m^n\) augmented only by the uniform lower bounds
$$\begin{aligned} \ell \le\,&x_i&i&=m,\ldots ,n. \end{aligned}$$ (A3)
The following lemma provides further auxiliary results.
Lemma 12
Let \(m\le h\le k\le n\) and retain the assumptions of Proposition 10.
-
1.
The problem \((P_{\ell }^u)_h^k\) is consistent and has the equivalent representation (1), (2) and
$$\begin{aligned} {\bar{\ell }}_i\le\,&x_i\le {\bar{u}}_i&i&=h,\ldots ,k \end{aligned}$$ (A4)
with \({\bar{\ell }}_i=\max (\ell _i,\ell )\le \min (u_i,u)={\bar{u}}_i\) for \(i=h,\ldots ,k\). In this form, the interval constraints of \((P_{\ell }^u)_h^k\) satisfy parts 2. and 3. of Lemma 1. For \((P_{\ell })_h^k\), corresponding assertions hold.
-
2.
Let the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\) of \(P_h^k\) also satisfy (A1) for \(i=h,\ldots ,k\). Then it minimizes \((P_{\ell }^u)_h^k\).
-
3.
Let the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\) of \(P_h^k\) also satisfy (A3) for \(i=h,\ldots ,k\). Then it minimizes \((P_{\ell })_h^k\).
-
4.
Let the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\) of \((P_{\ell }^u)_h^k\) satisfy \(x^{\star }_i<u\) with strict inequality for \(i=h,\ldots ,k\). Then it minimizes \((P_{\ell })_h^k\).
Proof
1: With parts 2. and 3. of Lemma 1 for \(P_h^k\) and the assumptions, we find
Hence, \({\bar{\ell }}_i=\max (\ell _i,\ell )\le \min (u_i,u)={\bar{u}}_i\) for \(h\le i\le k\) follows. Moreover, (4) and (A1) are clearly equivalent to (A4). Remarking also \({\bar{\ell }}_i\le {\bar{\ell }}_j\) and \({\bar{u}}_i\le {\bar{u}}_j\) for \(h\le i\le j\le k\), the problem with objective function (1) and constraints (2), (A4) is consistent by part 3. of Lemma 1.
2: Consider the dual variables associated with the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\). Concatenating dual variables with value 0 for the additional affine inequality constraints (A1), the KKT conditions are clearly still satisfied.
3: The argument is analogous to that of part 2.
4: Consider the dual variables associated with the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\). By complementary slackness, those associated with the constraints \(x_i\le u\) for \(i=h,\ldots ,k\) take value 0. Hence we can drop the corresponding terms from the first-order conditions, retaining the KKT conditions for \((P_{\ell })_h^k\). \(\square\)
We come to the proof of Proposition 10.
Proof
The case \(\ell =u\) is straightforward, so we assume \(\ell <u\).
Let the index values \(m\le h\le k\le n+1\) be defined such that
According to Proposition 3, \(x^{\star }_h,\ldots ,x^{\star }_{k-1}\) minimize \(P_h^{k-1}\). By part 2. of Lemma 12, they minimize \((P_{\ell }^u)_h^{k-1}\).
According to part 1. of Lemma 12, Lemma 2 applies to \((P_{\ell }^u)_m^n\). Hence, to complete the proof of Proposition 10, it suffices to show that \(y_i=\ell\) for \(i=m,\ldots ,h-1\) minimize \((P_{\ell }^u)_m^{h-1}\) and \(y_i=u\) for \(i=k,\ldots ,n\) minimize \((P_{\ell }^u)_k^n\). We only prove the latter claim; the former follows analogously.
On the one hand, remark that \(x^{\star }_k,\ldots ,x^{\star }_n\) minimize \(P_k^n\) by Proposition 3. Thus they minimize \((P_{\ell })_k^n\) by part 3. of Lemma 12.
On the other hand, denote by \(z_k,\ldots ,z_n\) the minimizer of \((P_{\ell }^u)_k^n\), remarking \(z_k\le \cdots \le z_n\le u\) by feasibility. Aiming for a contradiction, we assume there is an index \(j\in \{k,\ldots ,n\}\) with \(z_j<u\) and we choose j maximally. That implies either \(j=n\) or \(z_j<u=z_{j+1}\). By Proposition 3, \(z_k,\ldots ,z_j\) minimize \((P_{\ell }^u)_k^j\) in both cases. Part 4. of Lemma 12 shows that \(z_k,\ldots ,z_j\) also minimize \((P_{\ell })_k^j\).
Corollary 11 applies to the minimizer \(x^{\star }_k,\ldots ,x^{\star }_n\) of \((P_{\ell })_k^n\) and the minimizer \(z_k,\ldots ,z_j\) of \((P_{\ell })_k^j\), yielding \(x^{\star }_j\le z_j\). This contradicts our assumptions \(z_j<u\le x^{\star }_j\), thus showing \(z_k=\cdots =z_n=u\) after all and completing the proof. \(\square\)
Appendix B: The approach of [8]
For ease of reference and comparison, we formulate “isotonic regression breakpoints”, the approach of Section 7.2, [8, pp. 202] for a consistent problem with objective function (1) and constraints (2), (3) with initial index \(m=1\).
The preprocessing of weights and the feasibility check of [8, p. 197] coincide with (4) and part 3. of Lemma 1, respectively. They thus transform the problem into \(P_1^n\) and confirm its consistency.
With the notation \(f({\textbf{x}})=\sum _{i=1}^nf_i(x_i)\) of [8, p. 192] for the separable objective function, the convex extensions of \(f_i\) as \(i=1,\ldots ,n\) according to [8, p. 197] take the form
for \(x\in {\mathbb {R}}\), where
is a sufficiently large constant. Since we consider continuous decision variables, we replace the “one-sided discrete subgradients” \(f_i(x+1)-f_i(x)\) of [8, p. 197] by the ordinary subgradients. Since \(f_i\) is almost everywhere differentiable, these subgradients collapse to the derivatives, one-sided only at \(\ell _i\) and \(u_i\). In our quadratic case, we thus find
where \(i=1,\ldots ,n\) and
denotes the indicator function of the logical statement A.
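The extended derivative described above is piecewise: a large negative slope below the lower bound, the quadratic's ordinary derivative in between, and a large positive slope above the upper bound. A C sketch under the assumption that \(f_i(x)=w_i(x-a_i)^2/2\) on \([\ell _i,u_i]\); this parametrization and all identifiers are ours, chosen for illustration:

```c
#include <assert.h>
#include <math.h>

/* Derivative of the convex extension of f_i as in (B5):
 * slope -M below the lower bound l, slope +M above the upper
 * bound u, and the quadratic's derivative w*(x - a) inside [l, u].
 * M stands for the sufficiently large constant of [8, p. 197].
 * The quadratic form f_i(x) = w*(x - a)^2 / 2 is an assumption. */
double fprime(double x, double a, double w,
              double l, double u, double M)
{
    if (x < l) return -M;
    if (x > u) return  M;
    return w * (x - a);   /* ordinary derivative on [l, u] */
}
```

Since these derivatives are nondecreasing in x, so are the partial sums \(F_k\) built from them, which is what the breakpoint search relies on.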
As noted there, the partial sums \(F_0=0\) and \(F_k=F_{k-1}+f'_k\) for \(k=1,\ldots ,n\) of [8, p. 203] as well as their differences \(F_k-F_{h-1}\) for \(1\le h\le k\le n\) are sums of the isotonic derivatives (B5) and hence themselves isotonic. We determine explicit representations:
Lemma 13
Let \(1\le h\le k\le n\) and \(x\in {\mathbb {R}}\). Then we have
for \(x<\min \{\ell _k,u_h\}\) and
for \(x>\max \{\ell _k,u_{h}\}\). Assuming \(\ell _k\le u_h\), we also have
Remark that we give no representation of \(F_k(x)-F_{h-1}(x)\) in case \(u_h<x<\ell _k\).
Proof
We apply finite induction on \(k= h,\ldots ,n\). For \(k=h\), we have \(F_k-F_{k-1}=f'_k\) and \(\ell _k\le u_k\) by consistency. The assertions follow directly from (B5). For the inductive step \(k\rightarrow k+1\le n\), remark
Also recall part 2. of Lemma 1. In case \(x<\min \{\ell _{k+1},u_h\}\), we have
The claim now follows after adding \(f'_{k+1}(x)=-M\) for \(x<\ell _{k+1}\) on both sides.
The case \(x>\max \{\ell _k,u_h\}\) is handled analogously.
In case \(\ell _{k+1}\le x\le u_h\), recall \(\ell _k\le \ell _{k+1}\): as long as \(\ell _{k+1}\le u_h\) holds, \(\ell _k\le \ell _{k+1}\le u_h\) is guaranteed. We conclude
\(\square\)
We are now in the position to investigate the “breakpoints”
adapted from [8, p. 203] to our case of continuous decision variables.
Corollary 14
Let \(1\le h\le k\le n\).
-
1.
Assuming \(\ell _k\le u_h\), we have \(\ell _k\le {\bar{x}}^{\diamond }_{hk}\le u_h\) and \({\bar{x}}^{\diamond }_{hk}=\min \bigl (\max (\ell _k,{\bar{x}}^{\circ }_{hk}),u_h\bigr )\) with the weighted average \({\bar{x}}^{\circ }_{hk}\) being familiar from (9).
-
2.
In case \(u_h<\ell _k\), we have \({\bar{x}}^{\diamond }_{hh}\le u_h\le {\bar{x}}^{\diamond }_{hk}\).
Proof
1: Under \(\ell _k\le u_h\), (B6) shows \(F_k(x)-F_{h-1}(x)<0\) for \(x<\ell _k\) and (B7) shows \(F_k(x)-F_{h-1}(x)>0\) for \(x>u_h\). With (B9), \(\ell _k\le {\bar{x}}^{\diamond }_{hk}\le u_h\) follows. In this interval, \(F_k-F_{h-1}\) is continuous by (B8). So we can determine \({\bar{x}}^{\diamond }_{hk}\) by bounding the unconstrained solution \({\bar{x}}^{\circ }_{hk}\) of (B8) for \(F_k(x)-F_{h-1}(x)=0\).
2: We have \(F_h(x)-F_{h-1}(x)=f'_h=M>0\) for \(x>u_h\), yielding \({\bar{x}}^{\diamond }_{hh}\le u_h\). Under \(u_h<\ell _k\), (B6) entails \(F_k(x)-F_{h-1}(x)<0\) for \(x<u_h\), hence \(u_h\le {\bar{x}}^{\diamond }_{hk}\). \(\square\)
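Part 1. of Corollary 14 gives a direct recipe for the breakpoint when \(\ell _k\le u_h\): compute the weighted average over the pooled block and clamp it into \([\ell _k,u_h]\). A C sketch assuming the weighted least-squares objective with targets a and weights w (identifiers are ours, not from the paper's code):

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Breakpoint of Corollary 14, part 1, for the block h..k with
 * l_k <= u_h: the unconstrained weighted average, clamped into
 * [l_k, u_h].  Indices are zero-based and inclusive. */
double breakpoint(const double *a, const double *w,
                  size_t h, size_t k, double lk, double uh)
{
    double num = 0.0, den = 0.0;
    for (size_t i = h; i <= k; ++i) {
        num += w[i] * a[i];
        den += w[i];
    }
    double xbar = num / den;   /* unconstrained weighted average */
    if (xbar < lk) return lk;
    if (xbar > uh) return uh;
    return xbar;
}
```

Keeping running sums of \(w_ia_i\) and \(w_i\) per block makes each pooling step constant-time, which is how the linear overall complexity is preserved.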
Specialized to our situation, Theorem 5.1, [8, p. 197] and Lemma 7.1, [8, p. 203], show that the procedure “isotonic regression breakpoints” from [8, p. 203] solves consistent instances of \(P_1^n\). Algorithm 3 sketches a “naive implementation” of this procedure.
Part 2. of Corollary 14 shows that for fixed \(k=1,\ldots ,s\), the condition \(\ell _{j}\le u_{n_{k-1}}\) does not increase the minimal choice among the breakpoints \({\bar{x}}^{\diamond }_{n_{k-1},j}\) in line 5 as \(j=n_{k-1},\ldots ,n\).
Since the choice in line 6 is maximal, it ensures strict isotony of \({\bar{x}}^{\star }_k\). Remark that these quantities necessarily coincide with those we considered in (10) and (11) of Proposition 3.
Hochbaum and Queyranne [8, p. 204] present an implementation which attains the complexity stated in the introductory Sect. 1.
The following example shows that we cannot dispense with the explicit check of \(\ell _{j}\le u_{n_{k-1}}\) in line 6 of the algorithm.
Example 15
Consider the problem:
From \(f'_1=f'_2\) and \(f'_1(x)=M\) for \(1<x\) alongside \(f'_1(x)\le x-3\) for \(x\le 1\), we conclude \({\bar{x}}^{\diamond }_{11}={\bar{x}}^{\diamond }_{12}=1\). Then we find \(F_3(x)\ge M\) for \(1<x\) and \(F_3(x)\le 2x-6-M\) for \(x\le 1\), which implies \({\bar{x}}^{\diamond }_{13}=1\) as well, despite \(u_1=1<2=\ell _3\). Remark that the alternative definition
instead of (B9) would not change these results.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kopperschmidt, K., Stacevičius, R. Pooling adjacent violators under interval constraints. Optim Lett 18, 257–277 (2024). https://doi.org/10.1007/s11590-023-01988-9