# Simultaneous estimation of quantile curves using quantile sheets

DOI: 10.1007/s10182-012-0198-1

- Cite this article as:
- Schnabel, S.K. & Eilers, P.H.C. AStA Adv Stat Anal (2013) 97: 77. doi:10.1007/s10182-012-0198-1

## Abstract

The results of quantile smoothing often show crossing curves, in particular for small data sets. We define a surface, called a quantile sheet, on the domain of the independent variable and the probability. Any desired quantile curve is obtained by evaluating the sheet for a fixed probability. This sheet is modeled by \(P\)-splines in the form of tensor products of \(B\)-splines with difference penalties on the array of coefficients. The amount of smoothing is optimized by cross-validation. An application to reference growth curves for children is presented.

### Keywords

\(P\)-splines, Quantiles, Smoothing, Tensor product

## 1 Introduction

In the wake of quantile regression (Koenker 2005; Koenker and Bassett 1978) quantile smoothing has become a popular research subject. It is also an effective tool to study how the conditional distribution of a response variable changes with a covariate like time or age. Many proposals have been published and a range of software implementations are available for the popular R system, e.g. quantreg (Koenker 2011), cobs (Ng and Maechler 2011) or VGAM (Yee 2011).

In theory, conditional quantile curves cannot cross, but in practice they do, especially for small data sets. In many cases this is only a visual annoyance, but it may also jeopardize further analysis, e.g. when studying conditional distributions at specific values of the independent variable.

In the statistical literature one can find several proposals to prevent crossing of quantile curves. Especially in recent years this problem has received considerable attention. Among recent publications on the topic are approaches using natural monotonization (Chernozhukov et al. 2010), non-parametric techniques (Dette and Volgushev 2008; Shim et al. 2008; Takeuchi et al. 2006) as well as constraints enforcing non-crossing (Koenker and Ng 2005).

We propose an alternative approach. The basic idea is to introduce a surface on a two-dimensional domain. One axis is for the covariate \(x\), the other is for the probability \(\tau \). The quantile curve for any probability is found by cutting the surface at that probability. This surface is called a *quantile sheet*. We prefer the name sheet over surface to avoid possible confusion with the generalization of quantile curves to multiple covariates. Effectively, all possible quantile curves are estimated at the same time and the crossing problem disappears completely if the sheet is monotonically increasing with \(\tau \) for every \(x\). We describe the tools to make this happen.

The quantile sheet is constructed as a sum of tensor products of \(B\)-splines. In the spirit of \(P\)-splines (Eilers and Marx 1996), a rather large number of tensor products is used that may generally lead to over-fitting and a quite wobbly sheet, but additional difference penalties on the model coefficients allow proper tuning of the smoothness. We use separate penalties along the \(x\)- and the \(\tau \)-axes, because in general isotropic smoothing will be too restrictive.
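As a rough illustration of the separate penalties along the two axes, the sketch below builds difference matrices with NumPy and evaluates a roughness penalty in each direction of a coefficient array. The second-order differences, the array size, and the names `difference_matrix`, `sheet_penalty`, `lam_x` and `lam_tau` are assumptions made for this example; they are not taken from the text.

```python
import numpy as np

def difference_matrix(n, order=2):
    """Matrix D such that D @ a holds the order-th differences of a length-n vector a."""
    return np.diff(np.eye(n), n=order, axis=0)

def sheet_penalty(A, lam_x, lam_tau, order=2):
    """Anisotropic difference penalty on a K-by-L coefficient array A.

    Rows of A index the B-splines along x, columns those along tau.
    """
    K, L = A.shape
    D_x = difference_matrix(K, order)    # differences down the columns (x direction)
    D_tau = difference_matrix(L, order)  # differences along the rows (tau direction)
    pen_x = lam_x * np.sum((D_x @ A) ** 2)
    pen_tau = lam_tau * np.sum((A @ D_tau.T) ** 2)
    return pen_x + pen_tau

rng = np.random.default_rng(0)
k, l = np.meshgrid(np.arange(8), np.arange(6), indexing="ij")
A_smooth = 1.0 + 0.5 * k + 2.0 * l       # linear in both indices
A_wobbly = A_smooth + rng.normal(scale=0.5, size=A_smooth.shape)

pen_smooth = sheet_penalty(A_smooth, lam_x=1.0, lam_tau=10.0)
pen_wobbly = sheet_penalty(A_wobbly, lam_x=1.0, lam_tau=10.0)
print(pen_smooth, pen_wobbly)
```

With second-order differences, a coefficient array that is linear in both indices incurs no penalty, while a noisy array is penalized. Choosing `lam_tau` different from `lam_x` is what makes the smoothing anisotropic.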

The majority of quantile smoothing software is based on linear programming. It is an elegant approach, especially when interior point algorithms are being used. However, we fall back on the classic iteratively re-weighted least squares approach (Schlossmacher 1973) with a small modification. The reason for using Schlossmacher’s proposal is that we wish to apply the fast array algorithm for multidimensional \(P\)-spline fitting (Currie et al. 2006; Eilers et al. 2006). It is not at all clear to us how to combine that with linear programming. Examples in Bassett and Koenker (1992) raise doubts about the convergence of Schlossmacher’s approach, but these doubts appear to be unfounded: we analyzed the examples and found convergence to the right solution. More explanation and an example can be found in the Appendix.

With moderate or large amounts of smoothing (in the direction of \(\tau \)) the quantile sheet will be monotonically increasing. This is what we observed when we optimized the smoothing parameters by asymmetric cross-validation. But if one is not willing to trust that all will be well, additional asymmetric difference penalties can be adopted to enforce monotonicity as pioneered by Bollaerts et al. (2006) and Bondell et al. (2010).
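An asymmetric difference penalty of this kind could be sketched as follows. The notation is assumed here for illustration: \(a_{kl}\) are the coefficients of the tensor-product basis, with \(k\) indexing the \(x\)-direction and \(l\) the \(\tau \)-direction, and \(\kappa \) is a large positive constant.

\[
P_\kappa = \kappa \sum_{k} \sum_{l \ge 2} v_{kl}\, (a_{kl} - a_{k,l-1})^2,
\qquad
v_{kl} =
\begin{cases}
1 & \text{if } a_{kl} < a_{k,l-1}, \\
0 & \text{otherwise.}
\end{cases}
\]

Only decreasing steps along \(\tau \) are penalized, so with large \(\kappa \) the coefficients, and with them the sheet, are pushed towards being non-decreasing in \(\tau \). Because the indicators \(v_{kl}\) depend on the coefficients themselves, they would have to be updated iteratively.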

The manuscript is structured as follows. In Sect. 2, we first describe quantile smoothing using \(P\)-splines in general. Then we present quantile sheets as well as the application of fast array computations and optimal smoothing in this context. Applications to empirical data from a longitudinal study monitoring growth of children can be found in Sect. 3. The manuscript closes with a conclusion and an overview of open questions and further research.

## 2 Model description

### 2.1 Quantile smoothing with \(P\)-splines

Most algorithms for quantile regression and smoothing use linear programming. We wish to avoid that, because when doing the two-dimensional smoothing (see Sect. 2.2) with tensor products of \(B\)-splines, we want to exploit the fast array algorithm GLAM (Currie et al. 2006; Eilers et al. 2006). Schlossmacher (1973) showed how to approximate a sum of absolute values \(S = \sum _i|u_i|\) as a sum of weighted squares \(\tilde{S} = \sum _i u_i^2 / |\tilde{u}_i|\). Here \(\tilde{u}_i\) is an approximation to the solution. The idea is to start with \(\tilde{u}_i \equiv 1\) and to repeatedly apply the approximation until convergence. In practice, it is safer to use the approximation \(u_i^2/\sqrt{\tilde{u}_i^2 + \beta ^2}\), which avoids division by values at or near zero. Here \(\beta \) is a small number, of the order of \(10^{-4}\) times the maximum absolute value of the elements of the solution \(\hat{u}\).
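To make the iteration concrete, here is a minimal sketch that applies it to the simplest possible model, a single constant, so that the result is a sample quantile. The asymmetric weights (\(\tau \) for positive residuals, \(1-\tau \) otherwise), the function name `irls_quantile`, the stopping rule, and the choice to recompute \(\beta \) from the current residuals in every iteration are assumptions made for this example.

```python
import numpy as np

def irls_quantile(y, tau=0.5, n_iter=500, rel_beta=1e-4, tol=1e-10):
    """Estimate the tau-quantile of a sample by iteratively re-weighted least squares."""
    y = np.asarray(y, dtype=float)
    u = np.ones_like(y)          # starting approximation: all residuals equal to 1
    mu = np.inf                  # forces at least two passes through the loop
    for _ in range(n_iter):
        # Small constant that keeps the weights finite when a residual is (near) zero.
        beta = rel_beta * max(np.max(np.abs(u)), np.finfo(float).eps)
        # Asymmetric weight divided by the smoothed absolute residual.
        w = np.where(u > 0, tau, 1.0 - tau) / np.sqrt(u ** 2 + beta ** 2)
        mu_new = np.sum(w * y) / np.sum(w)   # weighted least squares for a constant
        u = y - mu_new
        if abs(mu_new - mu) < tol:
            mu = mu_new
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(1)
y = rng.normal(size=2000)
for tau in (0.1, 0.5, 0.9):
    print(tau, irls_quantile(y, tau), np.quantile(y, tau))
```

For a smooth curve instead of a constant, the weighted mean would be replaced by a penalized weighted least squares fit on a \(B\)-spline basis; the reweighting step itself stays the same.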

### 2.2 Quantile sheets

When estimating smooth quantile curves as described above, we choose a handful of values for \(\tau \) and separately compute a curve for each of them. Imagine a large number of values of \(\tau \) and a corresponding set of *non-crossing* quantile curves. Taking this to the limit we have a surface, above a rectangular domain defined by the dimensions \(x\) and \(\tau \). If we invert this reasoning, we assume the existence of a surface \(\mu (x; \tau )\), which we call a *quantile sheet* and we have to develop a procedure to estimate it for a given data set.
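One way to write this down, consistent with the tensor-product construction mentioned in the Introduction, is sketched below. The symbols \(a_{kl}\), \(B_k\), \(\breve{B}_l\), \(v_{ij}\), \(\lambda _x\), \(\lambda _\tau \), \(D_x\), \(D_\tau \) and the grid \(\tau _1, \dots , \tau _J\) are notation assumed here for illustration; the original formulation may differ in detail.

\[
\mu (x; \tau ) = \sum_{k=1}^{K} \sum_{l=1}^{L} a_{kl}\, B_k(x)\, \breve{B}_l(\tau ),
\]

\[
S = \sum_{i=1}^{n} \sum_{j=1}^{J} v_{ij}\, \bigl| y_i - \mu (x_i; \tau _j) \bigr|
  + \lambda _x \| D_x A \|_F^2 + \lambda _\tau \| A D_\tau ^{\top } \|_F^2,
\qquad
v_{ij} =
\begin{cases}
\tau _j & \text{if } y_i > \mu (x_i; \tau _j), \\
1 - \tau _j & \text{otherwise.}
\end{cases}
\]

Here \(A = [a_{kl}]\) is the array of coefficients and \(D_x\), \(D_\tau \) are difference matrices acting along the \(x\)- and \(\tau \)-directions. The quantile curve for a given probability \(\tau ^*\) is then the slice \(\mu (x; \tau ^*)\).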

### 2.3 Computation with array regression

More attractive is array regression as presented in Currie et al. (2006) and Eilers et al. (2006). Here, the construction and manipulation of the Kronecker product of the bases is avoided. The weight matrix \(V\) is kept as it is, and the \(n\) by \(J\) response matrix \(Y\) is formed by repeating \(y\) \(J\) times. The details of the algorithm are complicated and will not be presented here, but we provide a short sketch of its essential features.
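The following sketch illustrates only one ingredient of array regression, namely that fitted values on the \(n\) by \(J\) grid can be computed from the two marginal bases without ever forming their Kronecker product. It is not the full algorithm. The random matrices stand in for actual \(B\)-spline bases, and all variable names and dimensions are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, J, K, L = 50, 7, 6, 4       # observations, size of tau grid, basis sizes

B = rng.random((n, K))         # stand-in for the B-spline basis along x
B_tau = rng.random((J, L))     # stand-in for the B-spline basis along tau
A = rng.normal(size=(K, L))    # array of coefficients
y = rng.normal(size=n)

# Response matrix: y repeated J times, one column per probability on the grid.
Y = np.tile(y[:, None], (1, J))

# Array form: fitted sheet on the n-by-J grid from two small matrix products.
M_array = B @ A @ B_tau.T

# Explicit form: an (n*J)-by-(K*L) Kronecker basis times the vectorized coefficients.
M_kron = (np.kron(B, B_tau) @ A.ravel()).reshape(n, J)

residuals = Y - M_array        # same shape as the weight matrix
print(np.allclose(M_array, M_kron))
```

The array form needs storage for an \(n\) by \(K\) and a \(J\) by \(L\) matrix only, whereas the explicit basis has \(nJ\) rows and \(KL\) columns.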

### 2.4 Optimal smoothing

## 3 Application

### 3.1 Examples

This type of chart is also often used in practice by pediatricians to inform parents about the developmental status of their small children and to judge whether a child is developing normally in terms of weight.

### 3.2 A comparison with COBS

## 4 Conclusion

We have introduced a new approach to the estimation of smooth non-crossing quantile curves. Each curve is interpreted as a level curve of a sheet (a surface) above the \((x, \tau )\) plane. Tensor product \(P\)-splines are used to estimate the sheet. Regular difference penalties allow tuning of the smoothness and additional asymmetric difference penalties can enforce monotonicity (in the direction of the probability \(\tau \)). Application to real data showed convincing results.

If we take a wider perspective, quantile sheets are an interesting new statistical formalism, extending beyond a tool to avoid crossing curves. In fact, the estimation of quantile curves can be seen as a far from perfect way to estimate the more fundamental quantile sheet. Also, the smooth quantile sheet we obtain, as a sum of tensor products of (cubic) splines, can easily be differentiated and/or inverted to obtain estimates of the joint density or conditional cumulative distributions.

Quantile sheets result in smooth surfaces. Derivatives of our quantile sheets are smooth, too, and piecewise quadratic. The iterative reweighting approach allows the use of squares of differences in the penalties. The results compare favorably with the piecewise linear derivatives obtained with triogram smoothing (Koenker 2005), where the penalty is based on total variation (sums of absolute values of differences).

Another advantage of our technique is that it can be extended to the case of two or more covariates resulting in “quantile volumes”. If we write it as \(\mu (x, t; \tau )\) in two dimensions, the quantile surface above the \((x, t)\) plane, for a chosen probability \(\tau ^*\) is obtained by evaluating \(\mu (x, t; \tau ^*)\) at a chosen grid for \(x\) and \(t\).

Instead of asymmetrically weighted absolute values of residuals, one can use asymmetrically weighted squares, resulting in an *expectile sheet*. Expectiles have been introduced in Newey and Powell (1987) as a least squares alternative to quantile curves. Optimal smoothing for expectile curves has been studied in Schnabel and Eilers (2009). We have obtained very promising results for expectile sheets (Schnabel and Eilers 2011).
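As a small illustration of asymmetrically weighted squares, the sketch below computes a sample expectile, i.e. the expectile analogue of a constant fit, by repeatedly updating the weights and taking a weighted mean. The function name `sample_expectile` and the stopping rule are assumptions for this example; an expectile sheet would replace the weighted mean by a penalized tensor-product fit.

```python
import numpy as np

def sample_expectile(y, tau=0.5, n_iter=100, tol=1e-12):
    """Minimize sum_i w_i * (y_i - mu)^2 with w_i = tau if y_i > mu, else 1 - tau."""
    y = np.asarray(y, dtype=float)
    mu = np.mean(y)
    for _ in range(n_iter):
        w = np.where(y > mu, tau, 1.0 - tau)   # asymmetric weights
        mu_new = np.sum(w * y) / np.sum(w)     # weighted least squares for a constant
        if abs(mu_new - mu) < tol:
            mu = mu_new
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(2)
y = rng.normal(size=2000)
e10, e50, e90 = (sample_expectile(y, t) for t in (0.1, 0.5, 0.9))
print(e10, e50, e90)
```

For \(\tau = 0.5\) the weights are equal and the expectile reduces to the ordinary mean. No smoothing constant like \(\beta \) is needed, because the weights do not involve a division by residuals.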

One issue needs further study. To estimate the quantile sheet, we introduced a grid of values for the probability \(\tau \). The objective function we minimize is a sum of the asymmetric objective functions we know from the estimation of individual smooth quantile curves. The result does depend on the choice of grid for \(\tau \). In the limit, with a large uniform grid, we are approximating an integral over the domain of \(\tau \). A non-uniform grid implies a weighted integral. An interesting question is whether a non-uniform grid is better, and if so, what the optimal grid should be.
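The limiting argument can be sketched in symbols. Writing \(\rho _\tau (u)\) for the asymmetrically weighted absolute value, i.e. \(\tau |u|\) for \(u > 0\) and \((1-\tau )|u|\) otherwise (notation assumed here), and \(\Delta \tau \) for the spacing of a uniform grid \(\tau _1, \dots , \tau _J\), the unpenalized part of the objective behaves like a Riemann sum:

\[
\Delta \tau \sum_{j=1}^{J} \sum_{i=1}^{n} \rho _{\tau _j}\bigl( y_i - \mu (x_i; \tau _j) \bigr)
\;\approx \;
\int_{\tau _1}^{\tau _J} \sum_{i=1}^{n} \rho _{\tau }\bigl( y_i - \mu (x_i; \tau ) \bigr)\, d\tau .
\]

If the grid points are instead placed with a density \(g(\tau )\), the sum approximates, up to a constant factor, the weighted integral \(\int \sum _i \rho _\tau \bigl( y_i - \mu (x_i; \tau ) \bigr)\, g(\tau )\, d\tau \), so that regions of \(\tau \) with many grid points contribute more to the fit.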

In addition, there is the challenge to combine array regression with the interior point algorithm. In principle, this looks feasible, because the core of the interior point algorithm boils down to weighted regression. We hope to work on this in the future.

### Open Access

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.