
Pooling adjacent violators under interval constraints

  • Original Paper
  • Optimization Letters

Abstract

Approaches which “pool adjacent violators” are simple and efficient methods for solving isotonic regression problems. We extend this type of algorithm to include non-uniform interval constraints in the complete order case. We prove correctness and linear computational complexity of the resulting approach. We also show that a straightforward implementation in C outperforms more general solvers on a sequence of specifically designed problem instances.


Data availability

The online resource pavicDemo.c generates the data which we study in this note.

Notes

  1. Interval constraints are also known as bound constraints [1, p. 2], box constraints or variable bounds [2, p. 129].

  2. To be precise, their result is \(O(n\log U)\), where U is no larger than the difference between the imposed upper and lower bounds. The factor \(\log U\) is typical for integer decision variables; see [8, p. 193].

  3. In Algorithm 1, lines 1 to 8.

  4. We provide the C code and header file as online resources pavic.c and pavic.h. The program pavicDemo.c demonstrates their use. Together with pavicOSQPDemo.c, dykstraDemo.c and gurobi_test.py, it creates the results we present.

References

  1. de Leeuw, J., Hornik, K., Mair, P.: Isotone Optimization in R: Pool-Adjacent-Violators Algorithm (PAVA) and Active Set Methods (2009). https://cran.r-project.org/web/packages/isotone/vignettes/isotone.pdf

  2. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2009)


  3. Hu, X.: Application of the limit of truncated isotonic regression in optimization subject to isotonic and bounding constraints. J. Multivar. Anal. 71, 56–66 (1997)


  4. Németh, A., Németh, S.: How to project onto the monotone nonnegative cone using Pool Adjacent Violators type algorithms. arXiv:1201.2343 (2012)

  5. Chen, X., Lin, Q., Sen, B.: On degrees of freedom of projection estimators with applications to multivariate nonparametric regression. arXiv:1509.01877v4 (2018)

  6. Luss, R., Rosset, S.: Bounded isotonic regression. Electron. J. Stat. 11, 4488–4514 (2017). https://doi.org/10.1214/17-EJS1365


  7. Ahuja, R.K., Orlin, J.B.: A fast scaling algorithm for minimizing separable convex functions subject to chain constraints. Oper. Res. 49, 784–789 (2001)


  8. Hochbaum, D., Queyranne, M.: Minimizing a convex cost closure set. SIAM J. Discrete Math. 16(2), 192–207 (2003)


  9. Borwein, J., Lewis, A.: Convex Analysis and Nonlinear Optimization. Springer, New York (1999)


  10. Welford, B.P.: Note on a method for calculating corrected sums of squares and products. Technometrics 4(3), 419–420 (1962)


  11. Boyle, J.P., Dykstra, R.L.: A method for finding projections onto the intersection of convex sets in Hilbert spaces. Lect. Notes Stat. 37, 28–47 (1986)


  12. Gurobi Optimization, LLC: Gurobi Optimizer Reference Manual (2022). https://www.gurobi.com

  13. Stellato, B., Banjac, G., Goulart, P., Bemporad, A., Boyd, S.: OSQP: an operator splitting solver for quadratic programs. Math. Program. Comput. 12, 637–672 (2020). https://doi.org/10.1007/s12532-020-00179-2



Acknowledgements

We are grateful to Richard Cook, Mark Hogan, Dinesh Mehta, Bruce Richardson, Axel Tenbusch and Carlos Ugaz at NielsenIQ for their encouragement and support. We are particularly indebted to Ludo Daemen at NielsenIQ and to two anonymous reviewers. Their constructive and comprehensive criticism substantially improved our exposition.

Author information

Corresponding author

Correspondence to Kai Kopperschmidt.

Ethics declarations

Conflict of interest

We are both employees of NielsenIQ. The algorithm we present is patent pending.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendices

Appendix A: An iterative projection property

As a byproduct of our investigation, we present in this appendix an iterative projection property which contains the result of [4] as a special case and may thus be interesting in its own right.

Proposition 10

Assume \(P_m^n\) has the minimizer \(x^{\star }_m,\ldots ,x^{\star }_n \in {\mathbb {R}}\) and let \(\ell \le u\) denote two additional bounds \(\ell ,u\in {\mathbb {R}}\), which also satisfy \(\ell \le u_m\) and \(\ell _n\le u\). Then, \(P_m^n\) augmented by the uniform interval constraints

$$\begin{aligned} \ell&\le x_i\le u&i&=m,\ldots ,n \end{aligned}$$
(A1)

is consistent with the minimizer

$$\begin{aligned} y_i&=\min \bigl (\max (\ell ,x^{\star }_i),u\bigr )&i&=m,\ldots ,n \end{aligned}$$
(A2)
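
The clipping in (A2) is immediate to implement. As a minimal C sketch (the helper names are ours, not part of the online resources pavic.c and pavic.h), the minimizer of the augmented problem is obtained coordinate-wise from a given minimizer of \(P_m^n\):

    #include <stddef.h>

    /* Clip a value to the interval [lo, hi]; assumes lo <= hi. */
    static double clip(double x, double lo, double hi)
    {
        return x < lo ? lo : (x > hi ? hi : x);
    }

    /* Given the minimizer xstar[0..n-1] of P_m^n, fill y[0..n-1] with the
     * minimizer (A2) of the problem augmented by the uniform interval
     * constraints l <= x_i <= u. Illustrative helper, not published code. */
    void clip_minimizer(double *y, const double *xstar, size_t n,
                        double l, double u)
    {
        for (size_t i = 0; i < n; ++i)
            y[i] = clip(xstar[i], l, u); /* y_i = min(max(l, x*_i), u) */
    }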

Collecting elements for the proof of Proposition 10, we start with a corollary to Lemma 5.

Corollary 11

Let \(x^{\star }_m,\ldots ,x^{\star }_n\) minimize \(P_m^n\), let \(k\in \{m,\ldots ,n\}\) and let \(y_m,\ldots ,y_k\) minimize \(P_m^k\). Then we have \(x^{\star }_k\le y_k\).

Proof

The case \(k=n\) follows directly from part 4. of Lemma 1. So we assume \(k<n\) and let \(z_{k+1},\ldots ,z_n\in {\mathbb {R}}\) minimize \(P_{k+1}^n\). Consider two cases:

  1. In case \(y_k\le z_{k+1}\), the concatenation \(y_m,\ldots ,y_k,z_{k+1},\ldots ,z_n\) minimizes \(P_m^n\) according to Lemma 2. Since the minimizer is unique according to part 4. of Lemma 1, we conclude \(x^{\star }_k=y_k\).

  2. In case \(y_k>z_{k+1}\), Lemma 5 entails \(y_k\ge x^{\star }_k\ge z_{k+1}\).

Both cases entail \(x^{\star }_k\le y_k\) as asserted. \(\square\)

Denote with \((P_{\ell }^u)_m^n\) the problem \(P_m^n\) augmented by (A1). Denote with \((P_{\ell })_m^n\) the problem \(P_m^n\) augmented only by the uniform lower bounds

$$\begin{aligned} \ell&\le x_i&i&=m,\ldots ,n \end{aligned}$$
(A3)

The following lemma provides further auxiliary results.

Lemma 12

Let \(m\le h\le k\le n\) and retain the assumptions of Proposition 10.

  1. The problem \((P_{\ell }^u)_h^k\) is consistent and has the equivalent representation (1), (2) and

    $$\begin{aligned} {\bar{\ell }}_i\le\,&x_i\le {\bar{u}}_i&i&=h,\ldots ,k \end{aligned}$$
    (A4)

    with \({\bar{\ell }}_i=\max (\ell _i,\ell )\le \min (u_i,u)={\bar{u}}_i\) for \(i=h,\ldots ,k\). In this form, the interval constraints of \((P_{\ell }^u)_h^k\) satisfy parts 2. and 3. of Lemma 1. For \((P_{\ell })_h^k\), corresponding assertions hold.

  2. Let the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\) of \(P_h^k\) also satisfy (A1) for \(i=h,\ldots ,k\). Then it minimizes \((P_{\ell }^u)_h^k\).

  3. Let the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\) of \(P_h^k\) also satisfy (A3) for \(i=h,\ldots ,k\). Then it minimizes \((P_{\ell })_h^k\).

  4. Let the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\) of \((P_{\ell }^u)_h^k\) satisfy \(x^{\star }_i<u\) with strict inequality for \(i=h,\ldots ,k\). Then it minimizes \((P_{\ell })_h^k\).

Proof

1: With parts 2. and 3. of Lemma 1 for \(P_h^k\) and the assumptions, we find

$$\begin{aligned} \ell _i&\le u_i&\ell _i&\le \ell _k\le u&\ell&\le u_h\le u_i&\ell&\le u&i&=h,\ldots ,k \end{aligned}$$

Hence, \({\bar{\ell }}_i=\max (\ell _i,\ell )\le \min (u_i,u)={\bar{u}}_i\) for \(h\le i\le k\) follows. Moreover, (4) and (A1) are clearly equivalent to (A4). Remarking also \({\bar{\ell }}_i\le {\bar{\ell }}_j\) and \({\bar{u}}_i\le {\bar{u}}_j\) for \(h\le i\le j\le k\), the problem with objective function (1) and constraints (2), (A4) is consistent by part 3. of Lemma 1.

2: Consider the dual variables associated with the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\). Concatenating dual variables with value 0 for the additional affine inequality constraints (A1), KKT are clearly still satisfied.

3: The argument is analogous to that of part 2.

4: Consider the dual variables associated with the minimizer \(x^{\star }_h,\ldots ,x^{\star }_k\). By complementary slackness, those associated with the constraints \(x_i\le u\) as \(i=h,\ldots ,k\) take value 0. Hence we can drop the corresponding terms from the first-order conditions, retaining KKT for \((P_{\ell })_h^k\). \(\square\)

We come to the proof of Proposition 10.

Proof

The case \(\ell =u\) is straightforward, so we assume \(\ell <u\).

Let the index values \(m\le h\le k\le n+1\) be defined such that

$$\begin{aligned} x^{\star }_m\le&\cdots \le x^{\star }_{h-1}\le \ell \\ \ell<x^{\star }_h\le&\cdots \le x^{\star }_{k-1}<u\\ u\le x^{\star }_k\le&\cdots \le x^{\star }_n \end{aligned}$$

According to Proposition 3, \(x^{\star }_h,\ldots ,x^{\star }_{k-1}\) minimize \(P_h^{k-1}\). By part 2. of Lemma 12, they minimize \((P_{\ell }^u)_h^{k-1}\).

According to part 1. of Lemma 12, Lemma 2 applies to \((P_{\ell }^u)_m^n\). Hence, to complete the proof of Proposition 10, it suffices to show that \(y_i=\ell\) for \(i=m,\ldots ,h-1\) minimize \((P_{\ell }^u)_m^{h-1}\) and that \(y_i=u\) for \(i=k,\ldots ,n\) minimize \((P_{\ell }^u)_k^n\). We only prove the latter claim; the former follows analogously.

On the one hand, remark that \(x^{\star }_k,\ldots ,x^{\star }_n\) minimize \(P_k^n\) by Proposition 3. Thus they minimize \((P_{\ell })_k^n\) by part 3. of Lemma 12.

On the other hand, denote by \(z_k,\ldots ,z_n\) the minimizer of \((P_{\ell }^u)_k^n\), remarking \(z_k\le \cdots \le z_n\le u\) by feasibility. Aiming for a contradiction, we assume there is an index \(j\in \{k,\ldots ,n\}\) with \(z_j<u\) and we choose j maximally. That implies either \(j=n\) or \(z_j<u=z_{j+1}\). By Proposition 3, \(z_k,\ldots ,z_j\) minimize \((P_{\ell }^u)_k^j\) in both cases. Part 4. of Lemma 12 shows that \(z_k,\ldots ,z_j\) also minimize \((P_{\ell })_k^j\).

Corollary 11 applies to the minimizer \(x^{\star }_k,\ldots ,x^{\star }_n\) of \((P_{\ell })_k^n\) and the minimizer \(z_k,\ldots ,z_j\) of \((P_{\ell })_k^j\), yielding \(x^{\star }_j\le z_j\). This contradicts our assumptions \(z_j<u\le x^{\star }_j\), thus showing \(z_k=\cdots =z_n=u\) after all and completing the proof. \(\square\)

Appendix B: The approach of [8]

For ease of reference and comparison, we formulate “isotonic regression breakpoints”, the approach of Section 7.2 of [8, p. 202], for a consistent problem with objective function (1) and constraints (2), (3) with initial index \(m=1\).

The preprocessing of weights and the feasibility check of [8, p. 197] coincide with (4) and part 3. of Lemma 1, respectively. They thus transform the problem into \(P_1^n\) and confirm its consistency.

With the notation \(f({\textbf{x}})=\sum _{i=1}^nf_i(x_i)\) of [8, p. 192] for the separable objective function, the convex extensions of \(f_i\) as \(i=1,\ldots ,n\) according to [8, p. 197] take the form

$$\begin{aligned} f_i(x)&= {\left\{ \begin{array}{ll} \frac{w_i}{2}\cdot (\ell _i-x^{\circ }_i)^2+M\cdot (\ell _i-x) & x<\ell _i\\ \frac{w_i}{2}\cdot (x-x^{\circ }_i)^2 & \ell _i\le x\le u_i\\ \frac{w_i}{2}\cdot (u_i-x^{\circ }_i)^2+M\cdot (x-u_i) & x>u_i\\ \end{array}\right. }&i&=1,\ldots ,n \end{aligned}$$

for \(x\in {\mathbb {R}}\), where

$$\begin{aligned} M&>\sum _{i=1}^nw_i\cdot \max \{|\ell _i-x^{\circ }_i|,|u_i-x^{\circ }_i|\} \end{aligned}$$

is a sufficiently large constant. Since we consider continuous decision variables, we replace the “one-sided discrete subgradients” \(f_i(x+1)-f_i(x)\) of [8, p. 197] by the ordinary subgradients. Since \(f_i\) is almost everywhere differentiable, these subgradients collapse to the derivatives, one-sided only at \(\ell _i\) and \(u_i\). In our quadratic case, we thus find

$$\begin{aligned} f'_i(x)&=w_i\cdot (x-x^{\circ }_i)\cdot I(\ell _i\le x\le u_i) +M\cdot \bigl (I(u_i<x)-I(x<\ell _i)\bigr ) \end{aligned}$$
(B5)

where \(i=1,\ldots ,n\) and

$$\begin{aligned} I(A)&= {\left\{ \begin{array}{ll} 1 & \text {if}\, A\, \text {is true}\\ 0 & \text {otherwise}\\ \end{array}\right. } \end{aligned}$$

denotes the indicator function of the logical statement A.
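
As a concrete rendering of (B5), a C sketch of this derivative might look as follows; the function name and argument order are ours, and M must satisfy the bound stated above:

    /* One-sided derivative (B5) of the extended objective f_i at x.
     * w: weight w_i, x0: target x°_i, l, u: bounds with l <= u,
     * M: penalty slope. Illustrative sketch, not taken from pavic.c. */
    double fprime(double x, double w, double x0, double l, double u, double M)
    {
        if (x < l) return -M;  /* left penalty branch, x below l_i  */
        if (x > u) return  M;  /* right penalty branch, x above u_i */
        return w * (x - x0);   /* quadratic branch on [l_i, u_i]    */
    }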

As noted there, the partial sums \(F_0=0\) and \(F_k=F_{k-1}+f'_k\) for \(k=1,\ldots ,n\) of [8, p. 203], as well as their differences \(F_k-F_{h-1}\) for \(1\le h\le k\le n\), are sums of the isotonic derivatives (B5) and hence themselves isotonic. We determine explicit representations:

Lemma 13

Let \(1\le h\le k\le n\) and \(x\in {\mathbb {R}}\). Then we have

$$\begin{aligned} F_k(x)-F_{h-1}(x)&= -M+\sum _{i=h}^{k-1}\bigl ( w_i(x-x^{\circ }_i)\cdot I(\ell _i\le x)-M\cdot I(x<\ell _i)\bigr ) \end{aligned}$$
(B6)

for \(x<\min \{\ell _k,u_h\}\) and

$$\begin{aligned} F_k(x)-F_{h-1}(x)&= M+\sum _{i=h+1}^{k}\bigl (w_i(x-x^{\circ }_i)\cdot I(x\le u_i)+M\cdot I(u_i<x)\bigr ) \end{aligned}$$
(B7)

for \(x>\max \{\ell _k,u_{h}\}\). Assuming \(\ell _k\le u_h\), we also have

$$\begin{aligned} F_k(x)-F_{h-1}(x)&=\sum _{i=h}^k w_i(x-x^{\circ }_i)&\ell _k&\le x \le u_{h} \end{aligned}$$
(B8)

Remark that we give no representation of \(F_k(x)-F_{h-1}(x)\) in case \(u_h<x<\ell _k\).

Proof

We apply finite induction on \(k= h,\ldots ,n\). For \(k=h\), we have \(F_k-F_{k-1}=f'_k\) and \(\ell _k\le u_k\) by consistency. The assertions follow directly from (B5). For the inductive step \(k\rightarrow k+1\le n\), remark

$$\begin{aligned} F_{k+1}-F_{h-1}&=(F_{k+1}-F_k)+(F_k-F_{h-1}) =f'_{k+1}+(F_k-F_{h-1}) \end{aligned}$$

Also recall part 2. of Lemma 1. In case \(x<\min \{\ell _{k+1},u_h\}\), we have

$$\begin{aligned} F_k(x)-F_{h-1}(x)=\Bigl (\sum _{i=h}^k w_i(x-x^{\circ }_i)\Bigr )\cdot I(\ell _k\le x) \\ +\Bigl (-M+\sum _{i=h}^{k-1}\bigl ( w_i(x-x^{\circ }_i)\cdot I(\ell _i\le x)-M\cdot I(x<\ell _i)\bigr )\Bigr )\cdot I(x<\ell _k) \\ =\sum _{i=h}^{k}\bigl ( w_i(x-x^{\circ }_i)\cdot I(\ell _i\le x)-M\cdot I(x<\ell _i)\bigr ) \end{aligned}$$

The claim now follows after adding \(f'_{k+1}(x)=-M\) for \(x<\ell _{k+1}\) on both sides.

The case \(x>\max \{\ell _k,u_h\}\) is handled analogously.

In case \(\ell _{k+1}\le x\le u_h\), recall \(\ell _k\le \ell _{k+1}\): as long as \(\ell _{k+1}\le u_h\) holds, \(\ell _k\le \ell _{k+1}\le u_h\) is guaranteed. We conclude

$$\begin{aligned} F_{k+1}(x)-F_{h-1}(x)&=f'_{k+1}+(F_k-F_{h-1}) \\&=w_{k+1}(x-x^{\circ }_{k+1})+\sum _{i=h}^k w_i(x-x^{\circ }_i) =\sum _{i=h}^{k+1} w_i(x-x^{\circ }_i) \end{aligned}$$

\(\square\)
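
Since \(F_k-F_{h-1}\) is simply a sum of the derivatives (B5), it can also be evaluated directly. A sketch reusing the hypothetical fprime helper given after (B5), with h and k 1-based as in the text and arrays 0-based:

    #include <stddef.h>

    /* Evaluate F_k(x) - F_{h-1}(x) as the sum of f'_h(x), ..., f'_k(x).
     * Illustrative only; relies on the fprime sketch given after (B5). */
    double Fdiff(double x, const double *w, const double *x0,
                 const double *l, const double *u,
                 size_t h, size_t k, double M)
    {
        double s = 0.0;
        for (size_t i = h; i <= k; ++i)
            s += fprime(x, w[i-1], x0[i-1], l[i-1], u[i-1], M);
        return s;
    }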

We are now in a position to investigate the “breakpoints”

$$\begin{aligned} {\bar{x}}^{\diamond }_{hk}&=\inf \{x\in {\mathbb {R}}: F_k(x)\ge F_{h-1}(x)\}&1\le h\le k\le n \end{aligned}$$
(B9)

adapted from [8, p. 203] to our case of continuous decision variables.

Corollary 14

Let \(1\le h\le k\le n\).

  1. Assuming \(\ell _k\le u_h\), we have \(\ell _k\le {\bar{x}}^{\diamond }_{hk}\le u_h\) and \({\bar{x}}^{\diamond }_{hk}=\min \bigl (\max (\ell _k,{\bar{x}}^{\circ }_{hk}),u_h\bigr )\) with the weighted average \({\bar{x}}^{\circ }_{hk}\) being familiar from (9).

  2. In case \(u_h<\ell _k\), we have \({\bar{x}}^{\diamond }_{hh}\le u_h\le {\bar{x}}^{\diamond }_{hk}\).

Proof

1: Under \(\ell _k\le u_h\), (B6) shows \(F_k(x)-F_{h-1}(x)<0\) for \(x<\ell _k\) and (B7) shows \(F_k(x)-F_{h-1}(x)>0\) for \(x>u_h\). With (B9), \(\ell _k\le {\bar{x}}^{\diamond }_{hk}\le u_h\) follows. In this interval, \(F_k-F_{h-1}\) is continuous by (B8). So we can determine \({\bar{x}}^{\diamond }_{hk}\) by bounding the unconstrained solution \({\bar{x}}^{\circ }_{hk}\) of (B8) for \(F_k(x)-F_{h-1}(x)=0\).

2: We have \(F_h(x)-F_{h-1}(x)=f'_h=M>0\) for \(x>u_h\), yielding \({\bar{x}}^{\diamond }_{hh}\le u_h\). Under \(u_h<\ell _k\), (B6) entails \(F_k(x)-F_{h-1}(x)<0\) for \(x<u_h\), hence \(u_h\le {\bar{x}}^{\diamond }_{hk}\). \(\square\)
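
In the case \(\ell _k\le u_h\), part 1. of Corollary 14 thus reduces the breakpoint computation to a clipped weighted average, avoiding any root search. A minimal C sketch under that assumption, with 0-based arrays holding \(w_i\) and \(x^{\circ }_i\) and illustrative names of our own:

    #include <stddef.h>

    /* Breakpoint of Corollary 14, part 1: the weighted average of
     * x0[h-1..k-1] clipped to [lk, uh]; assumes lk <= uh and positive
     * weights. Indices h, k are 1-based as in the text. */
    double breakpoint(const double *w, const double *x0,
                      size_t h, size_t k, double lk, double uh)
    {
        double sw = 0.0, swx = 0.0;
        for (size_t i = h; i <= k; ++i) { /* accumulate weighted sums */
            sw  += w[i-1];
            swx += w[i-1] * x0[i-1];
        }
        double avg = swx / sw;            /* the weighted average from (9) */
        return avg < lk ? lk : (avg > uh ? uh : avg);
    }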

Specialized to our situation, Theorem 5.1, [8, p. 197], and Lemma 7.1, [8, p. 203], show that the procedure “isotonic regression breakpoints” from [8, p. 203] solves consistent instances of \(P_1^n\). Algorithm 3 sketches a “naive implementation” of this procedure.

[Algorithm 3: a naive implementation of the “isotonic regression breakpoints” procedure]

Part 2. of Corollary 14 shows that for fixed \(k=1,\ldots ,s\), the condition \(\ell _{j}\le u_{n_{k-1}}\) does not increase the minimal choice among the breakpoints \({\bar{x}}^{\diamond }_{n_{k-1},j}\) in line 5 as \(j=n_{k-1},\ldots ,n\).

Since the choice in line 6 is maximal, it ensures strict isotony of \({\bar{x}}^{\star }_k\). Remark that these quantities necessarily coincide with those we considered in (10) and (11) of Proposition 3.

Hochbaum and Queyranne [8, p. 204] present an implementation which attains the complexity stated in the introductory Sect. 1.

The following example shows that we cannot dispense with the explicit check of \(\ell _{j}\le u_{n_{k-1}}\) in line 6 of the algorithm.

Example 15

Consider the problem:

$$\begin{aligned} \text {minimize}\quad f(x_1,x_2,x_3)&=\frac{1}{2}(x_1-3)^2+\frac{1}{2}(x_2-3)^2+\frac{1}{2}(x_3-1)^2 \\ \text {subject to}\quad x_1\le x_2&\le x_3\\ 0\le x_1&\le 1\\ 0\le x_2&\le 1\\ 2\le x_3&\le 3 \end{aligned}$$

From \(f'_1=f'_2\) and \(f'_1(x)=M\) for \(1<x\) alongside \(f'_1(x)\le x-3\) for \(x\le 1\), we conclude \({\bar{x}}^{\diamond }_{11}={\bar{x}}^{\diamond }_{12}=1\). Then we find \(F_3(x)\ge M\) for \(1<x\) and \(F_3(x)\le 2x-6-M\) for \(x\le 1\), which implies \({\bar{x}}^{\diamond }_{13}=1\) as well, despite \(u_1=1<2=\ell _3\). Remark that the alternative definition

$$\begin{aligned} {\bar{x}}^{\diamond }_{hk}&=\inf \{x\in {\mathbb {R}}: F_k(x)> F_{h-1}(x)\}&1\le h\le k\le n \end{aligned}$$

instead of (B9) would not change these results.
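
The conclusion \({\bar{x}}^{\diamond }_{13}=1\) can also be checked numerically. The following self-contained C sketch (ours, not part of the online resources) evaluates \(F_3\) on a grid via (B5) with \(M=10\), which exceeds the required bound of 8 for this instance, and reports the first grid point with \(F_3(x)\ge 0\), approximating the breakpoint:

    #include <stdio.h>

    /* Derivative (B5) with unit weight w_i = 1. */
    static double fp(double x, double x0, double l, double u, double M)
    {
        if (x < l) return -M;
        if (x > u) return  M;
        return x - x0;
    }

    int main(void)
    {
        const double M = 10.0; /* any M > 8 satisfies the bound here */
        for (double x = 0.0; x <= 3.0; x += 1e-3) {
            double F3 = fp(x, 3.0, 0.0, 1.0, M)  /* f'_1 */
                      + fp(x, 3.0, 0.0, 1.0, M)  /* f'_2 */
                      + fp(x, 1.0, 2.0, 3.0, M); /* f'_3 */
            if (F3 >= 0.0) {
                /* prints a grid point just above the breakpoint 1 */
                printf("F_3 >= 0 first at x = %.3f\n", x);
                return 0;
            }
        }
        return 0;
    }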

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kopperschmidt, K., Stacevičius, R. Pooling adjacent violators under interval constraints. Optim Lett 18, 257–277 (2024). https://doi.org/10.1007/s11590-023-01988-9
