Parametric modeling of quantile regression coefficient functions with count data

Applying quantile regression to count data presents logical and practical complications which are usually solved by artificially smoothing the discrete response variable through jittering. In this paper, we present an alternative approach in which the quantile regression coefficients are modeled by means of (flexible) parametric functions. The proposed method avoids jittering and presents numerous advantages over standard quantile regression in terms of computation, smoothness, efficiency, and ease of interpretation. Estimation is carried out by minimizing a “simultaneous” version of the loss function of ordinary quantile regression. Simulation results show that the described estimators are similar to those obtained with jittering, but are often preferable in terms of bias and efficiency. To exemplify our approach and provide guidelines for model building, we analyze data from the US National Medical Expenditure Survey. All the necessary software is implemented in the existing R package qrcm.


Introduction
The analysis of count data represents an important topic in the statistic literature and numerous textbooks (e.g., Cameron and Trivedi 1998;McCullagh and Nelder 1989;Hastie and Tibshirani 1990;Winkelmann 2005) have been partially or entirely dedicated to the subject. The most well-established approach is to utilize parametric or semiparametric regression models that are typically based on the Poisson distribution and its generalizations, and maximize a pseudo-likelihood to estimate conditional means (Gourieroux et al. 1984). Machado and Santos Silva (2005) proposed analyzing count data using quantile regression (qr; Koenker and Bassett 1978;Koenker 2005). Their approach permits avoiding strong parametric assumptions and enables investigating every aspect of the conditional distribution, and not just its mean. This idea, however, comes with some complications. The fact that the conditional density of the data is not absolutely continuous, in combination with the non-smoothness of the objective function, causes a non-standard rate of convergence (Manski 1975(Manski , 1985 and may as well generate identifiability issues and computational problems. The solution proposed in Machado and Santos Silva's paper is to artificially smooth the data by applying jittering (Stevens 1950). A continuous outcome is generated by adding a random quantity in [0, 1) to the original counts, and estimation and inference are carried out by applying standard qr. A one-to-one correspondence between the quantiles of the counts and those of the artificial data can be established. This method has been applied in various fields including analysis of fertility (Miranda 2008;Booth and Kee 2009), frequency of doctor visits (Moreira and Barros 2010;Winkelmann 2006), car accidents (Qin and Reyes 2011), and capacity of pre-enrollment test to predict students' performance (Grilli et al. 2016). A Bayesian version of jittering has been proposed by Lee and Neocleous (2010). Jittering has been advocated as a computational trick to avoid degenerated solutions (e.g., Koenker 2017), which commonly occur in presence of discrete responses.
Other recent approaches include that of Congdon (2017), in which the asymmetric Laplace distribution is combined with a Poisson model in a Bayesian framework, and the model-based quantile regression of Padellini and Rue (2018), in which quantiles are mapped to the parameters of a generalized linear model identified by a continuous version of a valid count distribution. Tzavidis et al. (2015) proposed a semiparametric M-quantile approach for counts that extends the ideas of Cantoni and Ronchetti (2001) and Breckling and Chambers (2001). These methods avoid jittering, but depend on a limited choice of predefined parametric models.
The idea presented in this paper is to describe the quantile regression coefficients, say bðpÞ, by parametric, smooth functions of p, say bðp j hÞ, using the quantile regression coefficients modeling (QRCM) framework described in Frumento andBottai (2016, 2017). With this approach, the conditional quantile function is modeled parametrically as if the response variable was continuous. The goal is to utilize a ''working model'' that does not reflect the actual data distribution, but permits estimating a smooth quantile function by minimizing a smooth loss function, in the same spirit of Efron (1992).
The proposed method bears some similarities with Padellini and Rue's (2018) approach, in which a statistical model describes a hypothetical continuous counterpart of an originally discrete response. Our modeling framework, however, does not utilize known families of distributions such as Poisson or Negative Binomial, and allows to estimate quantile functions with an arbitrary parametric structure. As shown in the paper, this approach not only permits to avoid jittering, but also generates more efficient estimators, simplifies estimation and inference, and facilitates the interpretation of the results.
The paper is structured as follows. In Sect. 2 we describe the qrcm framework in a general quantile regression situation. In Sect. 3 we show how qrcm can be applied to count data. We describe computation and inference in Sect. 4, and report simulation results in Sect. 5. In Sect. 6 we show the usefulness of the proposed method by analyzing a dataset relating the frequency of doctor's visits to a set of demographic and socio-economic predictors. The R package qrcm implements the described estimator and includes all the necessary functions for inference, prediction, and plotting.

Quantile regression coefficients modeling
We denote by Y i a response variable of interest, and by x i a q-dimensional vector of observed covariates, i ¼ 1; . . .; n. The standard quantile regression (QR) model assumes that is the conditional quantile function of some known, monotone transformation TðÁÞ of Y. Working with a transformed response is common in practice and may be convenient with non-negative or bounded outcomes. Note that, in a general qr framework, the response variable is assumed to be sampled from an absolutely continuous population, which is not true if Y is a count. Estimation of bðpÞ is carried out by minimizing the objective function in which y i is a realization of Y i , and x p;i ¼ IðTðy i Þ x T i bðpÞÞ. Quantile regression does not require distributional assumptions and permits investigating every aspect of possibly asymmetric, heavy-tailed, or multimodal response variables, showing the effect of covariates at different quantiles. Although the distribution-free nature of qr is generally seen as an advantage, it also represents its main weakness. Quantiles are estimated one at a time and no parametric structure is assigned to the coefficient functions bðpÞ. This makes estimation inefficient, generates a large amount of random variability, and causes both the loss function, LðbðpÞÞ, and the estimated coefficients,bðpÞ, to be non-smooth functions of their arguments.
Joint estimation of multiple quantiles has been discussed by numerous authors (e.g., Tokdar and Kadane 2012;Reich 2012;Reich and Smith 2013;Yang and Tokdar 2017;Ghosal 2017, 2018;Fabrizi et al. 2020), and is implemented in the R packages BSquare  and qrjoint (Tokdar and Cunningham 2018). The idea of simultaneous quantile regression is recurrent in the literature on quantile crossing (e.g., He 1997; Bondell et al. 2010;Liu and Wu 2011), and can be used to improve estimation of individual fixed effects (Koenker 2004). Frumento andBottai (2016, 2017) suggested using a fully parametric approach and reformulated model (1) as where h is a vector of model parameters that describe the functional form of the coefficient functions, bðpÞ ¼ bðp j hÞ. This approach is referred to as quantile regression coefficients modeling (QRCM) and is implemented in the qrcm R package (Frumento 2020). An estimate of h can be obtained by minimizing which is the integral, with respect to the order of the quantile, of the loss function of standard quantile regression displayed in (2). Unlike LðbðpÞÞ, the loss defined in Eq. (4) is a smooth function of its arguments, which permits using Newton-type algorithms to carry out minimization, and applying the standard theory of M-estimator (e.g., Newey and McFadden 1994) to derive and implement asymptotics. An illustration of the parametric approach described in model (3) is given in Fig. 1, which is constructed using simulated data. On the left, we report the estimated qr coefficients of order p ¼ 0:01; 0:02; . . .; 0:99. On the right, we show a parametric fit based on a cubic function, bðp j hÞ Describing coefficients by smooth functions with closed-form mathematical expressions comes with an obvious gain in efficiency, especially in the tails, and makes it simpler to report and interpret the results.  Fig. 1 Comparison between qr and qrcm. Left: coefficient function estimated by standard quantile regression. Right: the coefficient is modeled by a cubic function, bðp j hÞ ¼ h 0 þ h 1 p þ h 2 p 2 þ h 3 p 3 , and estimation is carried out by minimizing LðhÞ (Eq. 4). Shaded areas represent pointwise confidence intervals

Quantile regression coefficients modeling with counts
When qr is applied to count data, a non-standard rate of convergence is obtained as a result of the non-smoothness of the objective function in conjunction with the discreteness of the response variable. The problem was undertaken, among others, by Manski (1975Manski ( , 1985 and Huber (1981). In their paper, Machado and Santos Silva (2005) suggested generating an artificial continuous variable Y Ã ¼ Y þ U by adding a quantity U 2 ½0; 1Þ to the original counts Y, a procedure that was referred to as jittering (Stevens 1950). The most common choice is to define U to be a uniform random variable, independent of Y and x. A quantile regression model of the form is assumed to hold for some known monotone transformation TðÁÞ of Y Ã . Because Y Ã is continuous, ordinary quantile regression can be applied to TðY Ã Þ and standard asymptotic theory holds. Given an estimatebðpÞ of bðpÞ, quantiles of the original count Y are consistently estimated bŷ where dae denotes the ceiling operator. Because the value ofbðpÞ depends on the specific realization fug n i¼1 of U, it is preferable to computebðpÞ as the average estimate across m jittered samples (average-jittering). Another solution is to adopt the Bayesian framework described by Lee and Neocleous (2010), in which new values of U are simulated at each iteration of the Monte Carlo Markov Chain algorithm.
Alternative approaches, that are briefly discussed in Machado and Santos Silva's paper, aim to replace LðbðpÞÞ with a smooth objective function. This can be obtained by replacing the indicator functions x p;i ¼ IðTðy Ã i Þ x T i bðpÞÞ (Eq. 2) with a smooth counterpart (e.g., an integrated kernel), or by directly using a different loss, such as the asymmetric maximum likelihood proposed by Efron (1992).
In this paper, we suggest applying the qrcm framework to model a discrete response, using a parametric quantile function as if the data were generated from an absolutely continuous population. The objective is that of imposing some degree of smoothing to the assumed distribution, without altering the response itself.
Our proposal is based on the empirical evidence that almost identical estimators are obtained by applying qrcm to the jittered response, Y Ã ¼ Y þ U, and directly to Y ¼ Y þ E½U. Such equivalence results from the parametric structure that is imposed to the quantile function, which permits ''smoothing away'' the points of mass in the empirical distribution of the data.
Without loss of generality, we assume E½U ¼ 0:5 and apply model (3) to some transformation TðÁÞ of Y ¼ Y þ 0:5, Here, bðp j hÞ is a vector of parametric coefficient functions that are assumed to be continuous and differentiable functions of p. The resulting quantile function is itself continuous, and depends on the covariates according to the same modeling structure used in standard quantile regression. The function to be minimized, LðhÞ, is given in Eq. (4) and, unlike LðbðpÞÞ, is also continuous and continuously differentiable.
After an estimateĥ of h has been computed, quantile regression coefficients are obtained asbðpÞ ¼ bðp jĥÞ, and Eq. (6) can be used to estimate the quantiles of Y, By definition, model (7) cannot be the true data-generating process and should be thought of as a ''working model''. Although an interpretation in terms of a latent continuous variable is possible, the idea of fitting a continuous quantile function to a discrete outcome should be regarded as a computational expedient and can be seen as an implicit way of performing jittering.

A toy example
Suppose that a discrete response variable Y is uniformly distributed on the support f0; 1; 2; . . .9g.
Simple algebra permits showing that the minimizer of LðhÞ under this model solves n À1 P i minðy i =h; 1Þ 2 ¼ 1=3 and is a function of the quadratic mean of the data. It can be easily shown that, asymptotically, the same estimators of quantiles are obtained with the following three methods: (1) standard qr applied to Y Ã ; (2) qrcm estimator applied to Y Ã ; and (3) qrcm estimator applied to Y ¼ Y þ ffiffiffiffiffiffiffi ffi 903 p =6 À 4:5 % Y þ 0:508. To prove this result, just note that ð3E½ðY Ã Þ 2 Þ 1=2 ¼ ð3E½ðY Þ 2 Þ 1=2 ¼ 10. While the equivalence between (1) and (2) is straightforward, point (3) shows that the same estimator can be obtained without jittering.

Model building
As shown in simulations in Sect. 5, a ''good'' parametric model may outperform standard qr with uniform jittering in terms of bias and standard error. Defining a parametric model for a quantile function, however, is not trivial. Numerous alternative strategies to model bðp j hÞ parametrically are presented in Frumento and Bottai's (2016;2017) papers, and the excellent book by Gilchrist (2000) can also be used for inspiration. The coefficient functions are usually simple, often monotone, and can sometimes be well approximated by linear or even constant functions. On the other hand, when flexibility is needed, polynomials or spline functions can be used. Some sensible options for model building are reported in Table 1.
Note that standard qr estimators can be obtained as a special case of qrcm by allowing bðp j hÞ to be an arbitrarily flexible function of p. The linear regression model, in which Y $ Nðb 0 þ b 1 x 1 þ b 2 x 2 þ Á Á Á ; r 2 Þ, is also a special case of model (3) and corresponds to the parametric quantile function defined by In this section, we describe a general strategy to formulate parametric quantile regression models that can be applied to count data. We assume to fit the working model defined by (7), In their 2005 paper, Machado and Santos Silva suggested modeling the following quantity: is the jittered variable, and some small positive number. This transformation is justified by the fact of using uniform jittering and, being a function of p, is only convenient when quantiles are estimated one at a time.
Here we suggest two alternative options: (i) to directly model Y , letting TðÁÞ ¼ IðÁÞ; and (ii) to use a log transformation, TðY Þ ¼ logðY Þ, by analogy with the standard link function of log-linear models. The choice is mainly determined by whether the association with the covariates is assumed to be linear or log-linear. Note, however, that a log-linear association could be well approximated by a linear model in which the covariates have been suitably transformed, e.g., by replacing them by the corresponding spline basis.
To formulate a parametric model for the coefficient functions, bðp j hÞ, we suggest using the following linear parametrization: bðp j hÞ ¼ hbðpÞ; where bðpÞ ¼ b 1 ðpÞ; . . .; b k ðpÞ ½ T is a set of k known functions of p, and h is a q Â k matrix with entries h jh , j ¼ 1; . . .; q, h ¼ 1; . . .; k. The j-th regression coefficient is given by b j ðp j hÞ ¼ h j1 b 1 ðpÞ þ Á Á Á þ h j k b k ðpÞ, and the quantile function is rewritten as This parametrization can prove flexible and computationally convenient (Frumento andBottai 2016, 2017). Consider, for example, a regression model with a single covariate x: Some distributional assumptions can be directly translated into a parametric model for b 0 ðp j hÞ and b 1 ðp j hÞ. For instance, if the working model assumes ðY À h 1 xÞ $ Expð1=h 0 Þ, the quantile function is Q Y ðp j x; hÞ ¼ Àh 00 logð1 À pÞþ h 11 x, which can be written as identifying bðpÞ ¼ 1; À logð1 À pÞ ½ T as the ''basis'' to be used for model building. To avoid strong parametric assumptions, it may be preferable to formulate a more flexible working model: b 0 ðp j hÞ ¼h 00 À h 01 logð1 À pÞ þ h 02 sin ðppÞ þ h 03 cos ðppÞ; b 1 ðp j hÞ ¼h 10 þ h 14 p: In this model, the intercept is described by the quantile function of an Exponential distribution, À logð1 À pÞ, that determines the shape of the right tail, and a combination of trigonometric functions; while the coefficient associated with x is assumed to be linear. In matrix form, the model is defined by : Note that this quantile function does not correspond to any known, closed-form family of probability distribution.
An alternative flexible parametrization is given by b 0 ðp j hÞ ¼h 00 þ h 01 p À h 02 logð1 À pÞ þ h 03 log p; b 1 ðp j hÞ ¼h 10 þ h 11 p þ h 14 ðp À 0:25Þ 3 þ h 15 ðp À 0:75Þ 3 : Here, the intercept is modeled by a quantile function that includes as special cases that of the (shifted) Exponential (h 01 ¼ h 03 ¼ 0), the asymmetric Logistic (h 01 ¼ 0), the standard Logistic (h 01 ¼ 0; h 02 ¼ h 03 ), and the Uniform (h 02 ¼ h 03 ¼ 0). The slope of x is a combination of linear and cubic functions. If x ! 0, then h ! 0 is a sufficient condition for Q TðY Þ ðp j x; hÞ to be monotonically increasing. This makes it simple to control for quantile crossing, that represents a common issue in standard quantile regression.
Possible choices of bðpÞ include polynomials p; p 2 ; p 3 ; . . . ½ , splines, piecewise linear functions, roots ½p 1=2 , ð1 À pÞ 1=2 , p 1=3 , ð1 À pÞ 1=3 ; . . ., logarithms logðpÞ; À logð1 À pÞ ½ , trigonometric functions cosðppÞ; sinðppÞ ½ , quantile functions of known distribution (e.g., that of a Beta or Gamma distribution), and combinations of the above. The intercept, b 0 ðp j hÞ, is more frequently modeled using unbounded functions, while the coefficients associated with x are usually assumed to be bounded and, in some situations, can be described by linear or even constant functions. A variety of modeling options will be used in the application described in Sect. 6.
As shown in the above examples, a common feature of parametric quantile functions is that the support of the response, which is identified by quantiles of order p ¼ 0 and p ¼ 1, may depend on estimated parameters. This is always the case, for example, if bðp j hÞ is assumed to be polynomial. In this situation, maximum likelihood estimators do not meet regularity conditions and cannot be computed by standard algorithms, which explains why most methods involving some sort of parametric modeling (e.g., Reich and Smith 2013) use Bayesian inference. This does not represent an issue in the current framework, where the likelihood function is not used. The minimizer of the simultaneous loss function LðhÞ defined in (4) always corresponds to an interior point.

Computation and inference
Define the model as in (7), where Y ¼ Y þ 0:5, and denote by y i ¼ y i þ 0:5, i ¼ 1; . . .; n, a vector of observed responses. We remark that, since Y is a count while Q TðY Þ ðp j x; hÞ describes a continuous response, the model does not directly reflect the true data-generating process. If an underlying continuous variable Z is invoked such that Y ¼ bZc, where bac denotes the floor operator, then the model may be assumed to correctly describe the quantile function of Z. In this case, however, Y ¼ Y þ 0:5 should be considered interval-censored between Y À 0:5 and Y þ 0:5. Here we avoid questioning whetherĥ consistently estimates a true parameter h 0 , and treat Q TðY Þ as a working model.
As shown in (4), an estimateĥ of h is computed as the minimizer of Numerical integration can be used to evaluate LðhÞ, and Newton-type algorithms can be applied to perform optimization. When bðp j hÞ ¼ hbðpÞ as in model (9), the following expression can be obtained (Frumento and Bottai 2016): BðuÞdu: In the above formulas, p i ¼ FðTðy i Þ j x i ; hÞ corresponds to the cumulative distribution function of y i evaluated at h, and can be obtained as the inverse of Q TðY Þ . In the implementation of the qrcm R package, BðpÞ and B are evaluated numerically, and a bisection algorithm is used to compute p i at the current estimate of h.
Following the standard theory of M-estimators (e.g., Newey and McFadden 1994), an estimate of covðĥÞ can be obtained as whereĤ is the matrix of second derivatives of LðhÞ, evaluated atĥ, andX is the outer product of the summands of the gradient. The exact expressions for H and X, of whichĤ andX are the sample counterparts, are provided in Frumento and Bottai (2016). Note that, unlike standard qr, the asymptotic covariance matrix of qrcm estimator does not involve nuisance parameters, which makes it unnecessary to estimate the sparsity function (e.g., Koenker and Machado 1999) or to use bootstrap to perform inference. If bðp j hÞ ¼ hbðpÞ as in model (9), an estimate of covðbðp j hÞÞ is easily obtained from d covðĥÞ by using quadratic forms. Obviously, the described inferential procedures are imperfect, as they ignore the fact that Q TðY Þ is not the true quantile function. However, simulation results show that reliable estimates of the standard errors are obtained even with a relatively small sample size.
For each scenario, we simulated R ¼ 1000 datasets of size n ¼ 300. For each simulated dataset we computed: (1) the qr estimators of bðpÞ, using average jittering with m ¼ 100 replicates; (2) the qrcm estimators, defining bðp j hÞ as for the ''true'' model; and (3) the qrcm estimators, defining bðp j hÞ to be a third-degree shifted Legendre's polynomial (e.g., El Attar 2009), an orthogonal polynomial in (0, 1) that can be used to formulate flexible models for the coefficient functions. qr estimators were applied to the response variable Y Ã ¼ Y þ U with U $ U½0; 1Þ, while qrcm were applied to Y ¼ Y þ 0:5.
In Tables 2 and 3, we report the average estimates of bðpÞ at different values of p, and the corresponding standard errors. When the ''true'' model was fitted, qrcm estimators were much more efficient than standard qr, and their average was closer to the ''true'' parameters. When a flexible model was used instead, the performance Results of simulation 1. For p ¼ ð0:10; 0:25; 0:50; 0:75; 0:90Þ, we report the mean and standard error (se) of qr and qrcm estimators of bðpÞ ¼ fb 0 ðpÞ; b 1 ðpÞ; b 2 ðpÞg. In the table, qrcm 1 denotes the estimators obtained by fitting the ''true'' model, while qrcm 2 refers to a model in which the coefficient functions are modeled through third-degree shifted Legendre polynomials. The last two columns report the average estimates ( b se) of the asymptotic standard errors of qrcm estimators was more similar to that of average-jittering qr, although some relevant efficiency gains were observed in the right tails where the data were more sparse. This suggests that describing bðp j hÞ by a parsimonious model can substantially improve on standard qr estimators, while overfitting tends to nullify the gain. In the last two columns of each table, only for qrcm estimators, we report the average estimated standard errors computed using the asymptotic covariance matrix defined by Eq. (11). Results suggest that inferential procedures are reliable, although the working model is only an approximation of the true data-generating process. Results of simulation 2

Analysis of NMES data
To illustrate the usefulness of the proposed method, we analyzed data from the US National Medical Expenditure Survey (NMES) conducted in 1987 and 1988 (Deb and Trivedi 1997;Kleiber and Zeileis 2008). The dataset includes a representative sample of n ¼ 4406 US civilians aged ! 66, for which numerous socio-economic indicators (age, gender, marital status, education, income) and indicators of health condition are available. The goal of our analysis was to predict the number of doctor's visits during a one-year period. The response variable had a skewed distribution with a very long right tail (Fig. 2). Higher-order quantiles, which correspond to very frequent doctor's visits, were considered particularly important. We formulated the following quantile regression model: where ''Health'' is a self-perceived health status, with levels ''poor'', ''average'', and ''excellent''; ''Nchronic'' is the number of chronic diseases; ''Male'' is an indicator of male gender; the age was expressed in decades, and centered at its median; ''School'' is the number of years of education, and was centered at its modal value of 12 years; ''Married'' is an indicator of marital status; ''Employed'' is an indicator of employment (about 90% of subjects were retired); the household income (in tens of thousands US dollars) was centered at its median. All individuals in the sample were covered by Medicare, a public insurance program that offers protection against health-related costs. ''Insurance'' is an indicator of whether the subject was also covered by a private insurance; and ''Medicaid'' indicates whether he or she was covered by MedicAid, a federal program complementary to Medicare.
Considering that most predictors were binary, and that the association between the number of doctor's visits and the non-binary covariates appeared to be well approximated by a straight line, we decided not to transform the response variable.
We then formulated a variety of parametric models to be applied to Y ¼ Y þ 0:5. We considered the following alternative parametrizations: j ¼ 0; . . .; 11. All models were formed by the combination of a 2nd-degree polynomial, and a function that could be used to describe a long right tail. Other flexible models could be defined using splines, trigonometric functions, or piecewise-linear functions. We considered a variety of d, namely d ¼ f0:5; 1; 2g in model (i), d ¼ f0:05; 0:10; 0:25g in model (ii), and d ¼ f1; 5; 10g in model (iii). We allowed the parametric form of b 0 ðp j hÞ to differ from that of the other coefficients. For example, we could use model (i) for b 0 ðp j hÞ, and model (ii) for b 1 ðp j hÞ; b 2 ðp j hÞ; . . .. Since all models had the same number of parameters, we selected the ''best'' model based on the minimized loss function. Note, however, that information criteria such as AIC and BIC can also be used for model selection. For general results on information criteria for M-estimators, see Machado (1993); a discussion of criteria for quantile regression models can be found in Lee et al. (2014).
The optimal model had all coefficients parametrized as in (i), with d ¼ 2: The estimated coefficient functions are represented by continuous lines in Figs. 3 and 4. The model parameters are summarized in Table 4, while selected percentiles are reported in Table 5.
Result showed that more frequent doctor's visits were associated with poorer health conditions, female gender (at least at quantiles below 0.75), higher education, and having additional private or public insurances. Age, marital status, and economic indicators did not appear to have a clear association with the response. Importantly, predictors affected the large quantiles much more than the low quantiles of the distribution. For example, the regression coefficient associated with poor health was less than 2 at the median, greater than 3 at the 75th percentile, and greater than 6 at the 95th percentile. The quantile function of two representative individuals, computed using Eqs. (6) and (8), is exemplified in Fig. 5. The estimates obtained using the described qrcm approach were very close to those of ordinary quantile regression with average-jittering, showing that the two methods are virtually equivalent. Using qrcm, however, resulted in a much faster computation and did not require using bootstrap to compute standard errors. Additionally, using a parametric model allowed to describe the coefficient functions by means of simple mathematical equations, improving the efficiency of the estimators and making it easier to summarize and interpret the results. For example, the quantile regression coefficient associated with the male gender indicator is given by b 4 ðpÞ ¼ À0:05 À 1:26p À 0:83p 2 À 1:01 logf1 À p 2 g. In the bottom row, we report the p-values for the null hypothesis fH 0 : h Ãh ¼ 0g, h ¼ 1; . . .; 4, which can be used to assess the significance of each component of bðpÞ. In the last column, we report the p-values for the null hypothesis fH 0 : h jÃ ¼ 0g, j ¼ 0; . . .; 11, which can be used to assess the significance of covariates. The asterisk ( Ã ) denotes significance less than 0.05 and quantile regression coefficients modeling (QRCM). The used parametric model is summarized in Table 4 and illustrated graphically in Fig. 3

. Estimated standard errors in brackets
We showed how the qrcm paradigm can be applied to count data, using a working model in which the assumed quantile function describes a continuous response. This, in combination with the smoothness of the objective function, avoids using jittering, generates efficient estimators, and simplifies inference.
Unlike other forms of model-based quantile regression, in which estimation is carried out in a Bayesian framework, the proposed approach adopts the frequentist paradigm. This is only possible because the minimizers of the objective function LðhÞ, unlike those of the likelihood function, correspond to interior points whether or not the model parameters affect the support of the response. This not only avoids the problem of selecting prior distributions, but may be considered an advantage in fields, like Medicine and Epidemiology, where Bayesian techniques are only used sparely.
In the paper, we only considered a linear quantile regression model of the form Q TðYÞ ðp j x; hÞ ¼ x T bðp j hÞ, and we used a linear parametrization bðp j hÞ ¼ hbðpÞ to describe the regression coefficients. These assumptions could be relaxed, for example by allowing Q TðYÞ ðp j x; hÞ to be a nonlinear function of x or b, or by assuming that bðp j hÞ is a nonlinear function of h. This would not affect the estimation method, but would make computation much more complicated without necessarily representing an advantage in terms of model flexibility. Describing a variety of meaningful parametric quantile functions can be the subject of future research.
Although empirical evidence suggests that parametric models are relatively immune to quantile crossing, the monotonicity of the estimated quantile function is  not generally guaranteed. Some special parametrization (e.g., Reich and Smith 2013;Yang and Tokdar 2017;Das and Ghosal 2017) can be used to avoid crossing. Alternatively, LðhÞ could be minimized subject to monotonicity constraints. This represent an important subject for future work, and a challenge from a computational standpoint. Quantiles are often more interesting than simple measures of location and scale, such as the mean and the variance. For example, many applications aim to describe the tail behavior and the impact of extreme observations. Our proposal may promote the widespread adoption of quantile regression methods in the analysis of count data. Possible applications are found in medicine, epidemiology, life sciences in general, sociology, psychology, and economics.
Providing a user-friendly implementation of the described estimator is an important part of our work. All software used in this paper is implemented in the qrcm R package. The package includes a main function iqr that carries out model fitting, and a variety of auxiliary functions that permit extracting information from the fitted model, performing prediction and extrapolation, plotting the estimated regression coefficients, and obtaining goodness-of-fit measures. The R package is available at http://CRAN.R-project.org/package=qrcm, and upon request to the authors.
Funding Open Access funding provided by Università di Pisa.
Availability of data and material The data are public.

Compliance with ethical standards
Conflicts of interest The authour declares that they have no conflict of interest statement.
Code availability The R code is public.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creativecommons.org/licenses/by/4.0/.