A constrained regression model for an ordinal response with ordinal predictors

A regression model is proposed for the analysis of an ordinal response variable depending on a set of multiple covariates containing ordinal and potentially other variables. The proportional odds model (McCullagh (1980)) is used for the ordinal response, and constrained maximum likelihood estimation is used to account for the ordinality of covariates. Ordinal predictors are coded by dummy variables. The parameters associated to the categories of the ordinal predictor(s) are constrained, enforcing them to be monotonic (isotonic or antitonic). A decision rule is introduced for classifying the ordinal predictors' monotonicity directions, also providing information whether observations are compatible with both or no monotonicity direction. In addition, a monotonicity test for the parameters of any ordinal predictor is proposed. The monotonicity constrained model is proposed together with three estimation methods and compared to the unconstrained one based on simulations. The model is applied to real data explaining a 10-Points Likert scale quality of life self-assessment variable from ordinal and other predictors.


Introduction
In many situations where regression models are suitable, the relationship between ordinal responses and ordinal predictors is of interest. However, statistical modelling for this type of relationship has called little attention. Even literature for ordinal predictors with any other type of scale of the response variable is scarce (see, for example, Tutz and Gertheiss (2014), and Rufibach (2010)).
In order to account for an ordinal response variable, proportional odds cumulative logit models (McCullagh (1980)) are used here in presence of multiple predictors allowing for different measurement scales. We pay special attention to the treatment of ordinal scale predictors. Their parameter estimates are restricted to be monotonic through constrained maximum likelihood estimation (CMLE). To begin with, consider for simplicity one ordinal response variable y with k categories and one ordinal predictor x with q categories. The corresponding model for this setup is j = 1, . . . , k − 1. α j and β p for p = 2, . . . q are real parameters. The observations are (x i , y i ), i = 1, . . . , n. The x i,p are dummy variables defined as x i,p = 1 if x i falls in the pth category of the ordinal predictor and 0 otherwise, with p = 1, . . . , q. Category number 1 is treated as the baseline category with β 1 = 0; therefore the dummy variable x i,1 = 1 − q p=2 x i,p is omitted and the sum in model (1) starts at p = 2. Monotonicity on {β p } is obtained by using CMLE. The general model is defined in Section 2, which allows for multiple ordinal predictors and other covariates of different measurement scales.
The monotonic effects approach to the ordinal predictors treatment is conceived here as an intermediate point between two general and common approaches within the context of regression analysis on observed variables. One of these common approaches corresponds to an unconstrained version of (1), treating the ordinal predictor as if it were nominal. This ignores the ordinal information. The other common approach treats an ordinal predictor as if it were of interval scale, replacing it by a single transformed variable after applying some scoring method, f . More formally, . This treats f (x) as interval scaled. Numerous data-based methods for scaling of ordinal variables have been proposed in the literature, on top of using plain equidistant Likert scaling (see, e.g., Bross (1958), Harter (1961), Tukey (1962), Hensler and Stipak (1979), Brockett (1981), and Casacci and Pareto (2015)), but ultimately in most situations the data do not carry conclusive information about the appropriateness of any scaling f . The intermediate approach proposed here is defined to achieve a set of linear estimates described by multiple magnitudes, as in the nominal scale approach, but allowing one direction only, as in the interval scale approach. The latter is attained by restricting the effects of the model (1) to be monotonic in either direction. The monotonicity assumption should not necessarily be taken for granted in regression with ordinal predictor and response. But it has a special status, similarly to linearity between interval-scaled variables. According to Stevens (1946) the interval scale is defined by the equality in meaning of differences between values regardless of the location of these differences on the measurement range. A linear relationship between intervalscaled variables means that the impact of a change in the predictor on the response is proportional to the meaning of the change of measurement at all locations of the measurement scale. For the ordinal measurement scale, only the order of measured values is meaningful. In this situation, analogously, if the change of an ordinal predictor has impact on an ordinal response directly in line with the meaning of the change of measurements, the connection would be monotonic.
Some other regression models for ordinal predictors are also based on the monotonic effects assumption. However, models for ordinal responses have not been explicitly discussed in this context. Tutz and Gertheiss (2014) used penalisation methods for modelling rating scales as predictors, and an active set algorithm was proposed by Rufibach (2010) to incorporate ordinal predictors in some regression models considering the response variable to be continuous, binary, or represent censored survival times, and assuming isotonic effects of the ordinal predictors' categories. Another related method is isotonic regression, mostly applied to continuous data (see, for example, Barlow and Brunk (1972), Dykstra et al. (1982), and Stout (2015)). In a broader context, there are some other types of statistical models that deal with ordinal data, such as those in item response theory (IRT) (e.g., Tutz (1990), Bacci et al. (2014)), latent class models (e.g., Moustaki (2000), Moustaki (2003), Vasdekis et al. (2012)), nonlinear principal components analysis (NLPCA) (e.g., De Leeuw et al. (2009), Linting and van der Kooij (2012) and Mori et al. (2016)), and nonlinear canonical correlation analysis (NLCCA) (e.g., Mardia et al. (1979) andDe Leeuw et al. (2009)). However, their settings are somewhat different compared to the one corresponding to modelling an ordinal response with ordinal predictors (and others) in classical linear regression. For instance, unlike IRT models and latent class models, classical regression models do not assume latent variables; and in contrast to NLPCA and NLCCA, classical regression models are not used as a dimensionality reduction technique and need a single dependent variable, respectively.
The monotonicity constrained regression model discussed here can be used for several purposes. When the unconstrained parameter estimates associated to the ordinal predictor are monotonic, then clearly there is no need of a constrained model. However, when these unconstrained estimates are non-monotonic, then there are some reasons why the constrained model could be useful. It is often of interest to compare unconstrained and constrained fits in order to decide whether there is evidence for non-monotonic relationship. In case that the unconstrained version does not provide a clearly better fit, the monotonic fit may be superior regarding interpretability, and may also lead to a smaller mean square error, as will be shown by simulations and a real data application.
In Section 2, the proposed model is developed in detail to obtain both constrained parameter estimates for multiple ordinal predictors and unconstrained estimates for other types of covariates. As the monotonic estimates can be either increasing (isotonic) or decreasing (antitonic), it is necessary to specify this relation while defining the constraints. Also, investigating possible directions of monotonicity for all ordinal predictors is of interest in its own right. Therefore, a monotonicity direction classification (MDC) procedure is introduced in Section 3 that determines the best possible combination of isotonic and/or antitonic associations as a way of assisting the estimation method of the constrained model introduced in Section 2. In Section 4 a monotonicity test is proposed as a complementary tool to assess the validity of the monotonicity assumption of each ordinal predictor. Both the MDC procedure and the monotonicity test provide statistical evidence on the validity of the monotonicity assumption. This can be incorporated in the estimation procedure; Section 5 presents two approaches, one based on the monotonicity test and another one based on the (less conservative) MDC procedure. On the other hand, the same procedures may also detect that the data are consistent with zero influence of a variable, in which case the variable may be dropped, this is treated in Section 5.3. Simulations are presented in Section 6 comparing the mean square error decomposition between the constrained and unconstrained approaches. Finally, the proposed model is applied to real data from the Chilean National Socio-Economic Characterisation in Section 7. A quality of life self-assessment variable using a 10-Points Likert scale is analysed considering ordinal and other predictors.

Proportional odds with monotonicity constraints
For an ordinal response variable y with k categories, let y i be the response category for subject i. The model of proportional odds is j = 1, . . . , k − 1, i = 1, . . . , n. A part of the elements of β corresponds to those effects associated to ordinal predictors categories in x, for which their parameter estimates are constrained to account for monotonicity as explained later.
When this model has one or more of both ordinal and non-ordinal predictors, it can be represented as where x i is a vector with v − t + t s=1 q s elements representing a set of t ordinal predictors (OP) and their t s=1 q s categories together with v nonordinal predictors for the ith observation. Each ordinal predictor is denoted by the subindex s, with s = 1, . . . , t, and contributes q s − 1 dummy variables to the model representing its ordinal categories {1, . . . , q s } assuming the first one as the baseline category, thus β s,1 = 0. Each dummy variable is defined as x i,s,ps = 1 if the ith observation falls in the category p s of the ordinal predictor s and 0 otherwise, with p s = 1, . . . , q s . Therefore, 2,2 , . . . , x i,2,q 2 , . . . , x i,t,2 , . . . , x i,t,qt , x i,u , . . . , x i,v ), where those variables with three indexes correspond to the observation of an ordinal predictor category and those with two are observations of other types of covariates.

Likelihood model fitting
Define π j (x i ) = P (y i = j|x i ), the probability of the response of subject i to fall in category j, and let y i1 , . . . , y ik be the binary indicators of the response for subject i, where y ij = 1 if its response falls in category j and 0 otherwise. Therefore, for independent observations, the likelihood function is based on the product of the multinomial mass functions for the n subjects: Hence, and the log-likelihood function for the model is As we are interested in a constrained version of this model with the aim of getting monotonic increasing/decreasing effects, it is necessary to define the set of constraints to be applied on the t sets of q s coefficients. The isotonic constraints are 0 ≤ β s,2 ≤ · · · ≤ β s,qs , ∀s ∈ I, where I ⊆ S, with S = {1, 2, . . . , t}, and β s,1 = 0. The antitonic constraints are 0 ≥ β s,2 ≥ · · · ≥ β s,qs , ∀s ∈ A, where A ⊆ S, and β s,1 = 0. An estimation method based on a monotonicity direction classification (MDC) procedure will be discussed in Section 3, allocating the ordinal predictors in either of these two subsets, achieving I ∪ A = S. These constraints can be expressed in matrix form as Cβ (ord) ≥ 0. The vector β (ord) is part of the vector β. The latter contains all the parameters associated to the t ordinal predictors and their q s − 1 categories together with the v non-ordinal predictors, β = β (ord) , β (non−ord) , with β (ord) = (β 1 , . . . , β t ) with s = 1, . . . , t, and β (non−ord) = (β 1 , . . . , β v ) with u = 1, . . . , v, where each vector β s = (β s,2 , . . . , β s,qs ) with p s = 2, . . . , q s . The matrix C is a square block diagonal matrix of t s=1 (q s − 1) dimensions composed of t square submatrices C s in its diagonal structure and zeros in its off-diagonal blocks as follows, where and each square submatrix C s has q s − 1 dimensions. Then, the maximisation problem is where 0 is a vector of t s=1 (q s − 1) elements. Now, (10) can be expressed as the Lagrangian where λ is the vector of t s=1 (q s − 1) Lagrange multipliers denoted by λ s,ps . The set of equations to be solved is obtained by differentiating L({α j }, β, λ) with respect to its parameters and equating the derivatives to zero. In order to solve this in R, the package maxLik (Henningsen and Toomet (2011)) offers the maxLik function which refers to constrOptim2. This function uses an adaptive barrier algorithm to find the optimal solution of a function subject to linear inequality constraints such as in (10) (Lange (2010)).

Monotonicity direction classification
Under the monotonicity assumption for all OPs, an important decision to be made is whether each ordinal predictor's set of effects (also referred to as pattern), is either isotonic, namely s ∈ I, or antitonic, s ∈ A. Also outside the context of parameter estimation, it may be of interest whether a predictor is connected to the response in an isotonic or antitonic way, or potentially whether monotonicity may not hold or whether both directions are compatible with the data.
One possible way to deal with this decision is to just maximise the likelihood, i.e., to fit 2 t models, one for each possible combination of monotonicity directions for the t ordinal predictors, and then choose the one with the highest likelihood. However, as the number of ordinal predictors t increases, the number of possible combinations of monotonicity directions becomes greater, which could lead to a considerable number of high dimensional models to be fitted.
Another possible estimation method uses a monotonicity direction classifier to find the monotonicity direction for each ordinal predictor and then fits only one model. This will be based on confidence intervals (CIs) for the parameters and on checking which monotonicity direction is compatible with these. This may miss the best model, but in some situations it may be desirable to take into account fewer than 2 t but more than a single model.
The two approaches are put together in a three steps monotonicity direction classification (MDC) procedure exploiting their best features. Each of the first two steps uses a decision rule with different confidence levels for the CIs, and the last step applies the multiple models fitting process described above over those patterns with no single monotonicity direction established in the previous steps. Before describing its steps, consider some remarks and definitions.
The parameters' CIs from an unconstrained model are the main input for the decision rule proposed here. It is possible to compute the CI defined in equation (12) for the parameters of an unconstrained version of the model (4) (Agresti (2010)). Denote SEβ as the standard error of the parameter estimateβ, then an approximate confidence interval for β with a 100(1−α)% confidence level isβ where zα /2 denotes the standard normal percentile with probabilityα/2. The values forβ and SEβ are obtained by fitting the proportional odds model (McCullagh (1980)) over the unconstrained model (4). The R function vglm of the package VGAM was used here (Yee et al. (2015)). The first two steps of the MDC procedure provide four possible outcomes for each pattern of unconstrained parameter estimates associated to an ordinal predictor's categories: 'isotonic', 'antitonic', 'both', and 'none'. The first two correspond to a classification of monotonicity direction whereas the remaining two to the case where a single direction is not found because either both directions of monotonicity are possible or the parameter estimates' pattern is not compatible with monotonicity, respectively. The idea is that the intersections of all CIs for the parameters of a single ordinal predictor together will either allow for isotonic but not antitonic parameters, or for antitonic but not isotonic parameters, or for both, or for neither. Formally, the MDC of the parameter estimates' pattern is defined as where D s,c = {d s,ps,p s ,c } is defined as the set of distinct values resulting from (14) for the ordinal predictor s considering confidence intervals with a 100c% confidence level, and ∀p s < p s and p s ∈ {2, 3, . . . , q s }, whereŨ s,ps,c is the confidence interval's upper bound of the parameter β s,ps associated to the category p s of the ordinal predictor s given a 100c% confidence level, andL s,ps,c is its corresponding lower bound. Note that, by definition, the first category of all ordinal predictors is set to zero, soL s,1,c =Ũ s,1,c = 0, ∀s. (14) yields 1 when the CI of the parameter β s,ps is fully above the one of β s,p s and consequently their CIs only allow an isotonic pattern; -1 when it is fully below pointing to an antitonic pattern; and 0 when there exists an overlap, meaning that both monotonicity directions are still possible. Each result of (14), denoted as d s,ps,p s ,c , can be understood as an indicator of the relative position of the confidence interval of the parameter β s,ps compared to the one of β s,p s , ∀p s < p s and p s ∈ {2, 3, . . . , q s }, belonging to the same ordinal predictor s and given a 100c% confidence level. As this is a pairwise comparison, there exist q s (q s − 1)/2 indicators for each ordinal predictor s. Equation (13) uses these indicators to classify the monotonicity direction of an ordinal predictor as a whole at a particularc.
The three steps MDC procedure has the following structure: Step 1 Setc at a relatively high 100c% confidence level, say 0.99, 0.95 or 0.90, and apply the MDC (13) to assign the subindexes s either to the set I or A defined in Section 2.1. Therefore, I 1 = {s : d s,c = isotonic} and A 1 = {s : d s,c = antitonic}, where I 1 and A 1 denote the isotonic and antitonic sets resulting from the step 1 respectively. In addition, define B 1 = {s : d s,c = both} and N 1 = {s : d s,c = none}. If (I 1 ∪ A 1 ) = S, then all the ordinal predictors' monotonicity directions have been decided, and there is no need to continue with the MDC procedure. Otherwise, the following step is used for the remaining cases only, (B 1 ∪ N 1 ).
Step 2 Consider the set of ordinal predictors {s : s ∈ (B 1 ∪ N 1 )} and apply the MDC (13) in an iterative manner while varying the confidence level 100c%. A decrease/increase ofc reduces/enlarges the range of the CIs of the parameter β s,ps ∀s ∈ (B 1 ∪ N 1 ) and p s ∈ {2, 3, . . . , q s }. These changes inc produce different effects on the classification depending on whether s ∈ B 1 or s ∈ N 1 , which must be used as follows: (a) For each s ∈ B 1 , the second step is to gradually decreasec while applying the decision rule (13) using a new confidence levelc s instead ofc, obtaining d s,c s . The level ofc s must be gradually decreased until either a pre-specified minimum confidence level referred to as tolerance levelc * s is reached, with 0 <c * s <c, or the ordinal predictor s is classified as either isotonic or antitonic by d s,c s .
(b) Conversely, for each s ∈ N 1 , gradually increasec while applying the MDC (13) using a new confidence levelc s obtaining d s,c s . The level ofc s must be gradually increased until either a higher confidence level referred to as tolerance levelc * s is reached, with c <c * s < 1, or the ordinal predictor s is classified as either isotonic or antitonic by d s,c s . Finally, I 2 = I 1 ∪ {s : d s,c s = isotonic or d s,c s = isotonic} and A 2 = A 1 ∪ {s : d s,c s = antitonic or d s,c s = antitonic}, where the subindex of I 2 and A 2 denotes results from the second step. After completing the second step, if (I 2 ∪ A 2 ) = S, then it is not necessary to continue with step 3 and the MDC procedure ends. If (I 2 ∪ A 2 ) ⊂ S, then the third and final step must be carried out.
Step 3 Fit 2 #{s:s / ∈(I 2 ∪A 2 )} models accounting for possible combinations of monotonicity directions of the ordinal predictors that were not classified as 'isotonic' or 'antitonic', i.e., those in the set {s : s / ∈ (I 2 ∪ A 2 )}, and choose the best model based on some optimality criterion, such as the maximum likelihood as used here.
In general, the MDC procedure describes two levels of decision. The first one is provided by step 1, where a confidence level is applied to all ordinal predictors by the use of a single parameterc. The second one is in step 2, where each ordinal predictor s ∈ (B 1 ∪ N 1 ) is classified based on its own confidence level.
Step 2 allows to classify predictors that were not classified based on the fixed initial confidence level.
In step 2, classifying more parameter estimates' patterns with s ∈ B 1 as either isotonic or antitonic requires a gradual reduction of the confidence level. The tolerance levelsc * s andc * s determine the leeway allowed for the confidence levels in order to enforce a decision. The choice of these may depend on the number of ordinal variables; if the number is small, running step 3 may not be seen as a big computational problem, and it may not be necessary to enforce many decisions in step 2. The tolerance levelc * s should not be too low, less than 0.8, say, because it is not desirable to make decisions based on a low probability of occurrence.
For those s ∈ N 1 in step 2, the researcher does not face such a trade-off, because greater confidence levels could increase (not decrease) the number of new isotonic or antitonic classifications for those s ∈ N 1 .
It is important to reduce (or increase) the confidence level in step 2 in a gradual manner, by 0.01 or 0.005, say, for each iteration. If the chosen intervals in the sequence of confidence levels to be assessed are too thick without assessing intermediate levels, then, for an ordinal predictor s ∈ B 1 , it is possible to switch its classification from 'both' to 'none' instead of updating it from 'both' to either 'isotonic' or 'antitonic'. Conversely, the class of an ordinal predictor s ∈ N 1 could change from 'none' to 'both'. The thinner the intervals in the sequence of confidence levels to be assessed are, the less likely to switch from 'both' to 'none' or 'none' to 'both' is. However, in some specific cases, there still is a probability of having such an undesired class change.
The researcher may also be interested in exploring other monotonicity directions rather than those resulting from the MDC procedure proposed here, although the maximum likelihood attained by the MDC procedure would not be reached. In this case, the correspondence of each ordinal predictor s to either I or A should simply be enforced when constructing C, the matrix of constraints, as described in Section 2.1.
In order to illustrate the MDC procedure, we consider a particular example of model (4) with four ordinal predictors only (t = 4 and v = 0), where q 1 = 3, q 2 = 4, q 3 = 5, q 4 = 6, and k = 4, i.e., j = 1, 2, 3. The parameters are chosen to be α 1 = −1, α 2 = −0.5, and α 3 = −0.1; and These parameters represent a situation in which all covariates are monotonic, with the elements of β 1 and β 2 being isotonic, and those of β 3 and β 4 antitonic patterns. Given monotonicity, the higher the distances between adjacent parameters are, the clearer the monotonicity direction is. In this illustration, these distances were chosen to make the monotonicity direction clear for the first ordinal predictor only and less clear for the remaining ones, s = 3 being the most unclear and challenging case because all of its parameters show little distance between adjacent categories and consequently from zero.
The 2,000 simulated observations of the ordinal predictors were obtained from the population distributions shown in Figure 1.
Using this simulated data set, an unconstrained version of the model was fitted to obtain the parameter estimates and their standard errors, with which a confidence interval can be computed for any level ofα using equation (12).
For the first step of the MDC procedure, the confidence level was set at a highc = 0.99. The resulting confidence intervals allowed to classify the first and second OP as 'isotonic', I 1 = {1, 2}, and the remaining two patterns of parameter estimates as 'both', B 1 = {3, 4}. Figure 2 shows that the latter two ordinal predictors allowed both directions of monotonicity, which is the reason why they were not classified as 'antitonic'. The second step was applied over each ordinal predictor s ∈ B 1 = {3, 4} using the same tolerance level,c * 3 =c * 4 = 0.8. For s = 3, it was not possible to classify its pattern as 'antitonic' before reaching the tolerance level. Therefore, it remained as 'both'. For s = 4, the procedure was applied until reachingc s = 0.96, where  the fourth OP was classified as 'antitonic'. Now, I 2 = {1, 2} and A 2 = {4}. As no monotonicity direction was identified for the third OP, two models were fitted in step 3 of the MDC procedure, one treating the third OP as 'isotonic' and the other one as 'antitonic'. Finally, the model with the highest log-likelihood was selected as the final one.
The procedure successfully classified the ordinal predictors s = 1, 2, 3, 4 as 'isotonic', 'isotonic', 'antitonic', and 'antitonic', respectively, despite the fact that the unconstrained parameter estimates of the last three are not monotonic. Furthermore, it reduced the number of possible models to be fitted from 16 to 2 while making decisions based on individual confidence levels of 96% or greater.
As shown in Figure 2, it is not easy to classify cases like s = 3 where all the parameter estimates are close to zero and their confidence intervals are big enough to make the monotonicity direction classification infeasible for any reasonable tolerance level. In this case, the tolerance level would have needed to be set atc * 3 ≤ 0.53 had we wanted the MDC procedure to classify the third ordinal predictor as either 'isotonic' or 'antitonic'. In fact, when doing so, the MDC makes a mistake and classifies it as 'isotonic'. This relationship between low tolerance levels and misclassification is the main reason why the procedure needs to start with a relatively high confidence levelc s and then gradually decrease it until reaching a reasonable tolerance level if necessary.
In cases like s = 3, one option is to remove this variable from the model because all of the CIs associated to it contain zero even if we choose a tolerance level lower than 0.80, which we consider too low. Removing this variable would have allowed us to fit just one model instead of two. However, removing variables may not be good if the aim is to achieve a model with optimal predictive power.

A monotonicity test
The MDC procedure assists the decision on the choice of an appropriate monotonicity direction assumption for each OP when fitting model (4), but it is not a formal monotonicity test. It relies on the analysis of multiple pairwise comparisons of confidence intervals with flexibly chosen confidence levels without caring about the simultaneous error probability.
When analysing the monotonicity assumption on the parameters associated to an OP s, the Bonferroni correction method can be used to construct a formal monotonicity test for an OP. The Bonferroni correction method allows to compute a set of confidence intervals achieving at least a 100(1 − α * s )% confidence level simultaneously (see Miller (1981), p. 67, andBonferroni (1936)), which is the probability that all the parameters are captured by the confidence intervals simultaneously. For a given ordinal predictor s and a pre-specified α * s , if each one of the q s − 1 confidence intervals is built with a 100(1 − α * s /(q s − 1))% confidence level, then the simultaneous confidence level will be at least 100(1 − α * s )%. The null hypothesis "H 0 : The parameters {β s,ps : p s = 1, 2, . . . , q s } are either isotonic or antitonic" (0 ≤ β s,2 ≤ β s,3 · · · ≤ β s,qs (isotonic) and 0 ≥ β s,2 ≥ β s,3 · · · ≥ β s,qs (antitonic)) is tested against the alternative "H 1 : The parameters {β s,ps : p s = 1, 2, . . . , q s } are neither fully isotonic nor fully antitonic" for a given OP s, and setting β s,1 = 0 as in previous sections.
For a given ordinal predictor s, and taking advantage of the ordinal information provided by its categories, it is then checked whether all the confidence intervals simultaneously are compatible with monotonicity.
In order to identify whether a pair of confidence intervals of β s,ps is incompatible with monotonicity, a slight modification of equations (13) and (14) is used. Now, instead of the confidence levelc, those equations usẽ b = 1 − α * s /(q s − 1). Therefore, the monotonicity test for an ordinal predictor s is where D s,b = {d s,ps,p s ,b } is defined as the set of distinct values resulting from using equation (14) for the ordinal predictor s considering each confidence interval with a 100b% confidence level (instead of 100c%) in order to achieve a simultaneous confidence level of at least 100(1 − α * s )% for the parameters associated to the OP s.
If T s,b = reject H 0 , then the parameters associated to the ordinal predictor s are not compatible with the monotonicity assumption with a simultaneous confidence level of at least 100(1 − α * s )%, whereb = 1 − α * s /(q s − 1). When applying this monotonicity test to the four OPs of the illustration discussed in Section 3 and using a pre-specified α * s = 0.05, all the OPs were found to be compatible with the monotonicity assumption.
For a given pre-determined significance level of α * s (say 0.1, 0.05 or 0.01), the Bonferroni correction will often be very conservative, and the more conservative, the higher the number of ordinal categories being involved in the monotonicity test. This is because each confidence interval is computed with an individual confidence level of 100(1 − α * s /(q s − 1))%. Therefore, a higher q s implies larger ranges of the intervals, making the test more likely to not reject H 0 .
In order to show some results for the monotonicity test with OPs for which their association with the response variable is truly non-monotonic, consider a setting for model (4) with two OPs only (t = 2 and v = 0), where q 1 = 4, q 2 = 5, and k = 4, i.e., j = 1, 2, 3. The parameters for the intercepts are α 1 = −1, α 2 = −0.5, and α 3 = −0.1; and the true sets of parameters of the OPs 1 and 2 represent non-monotonic associations, being β 1 = (0.4, 1.7, 0.8) and β 2 = (−0.25, −0.7, −0.05, 0.40). The distributions among categories of OPs 1 and 2 are the same as the ones described in Figure 1 for OPs 2 and 3 correspondingly, and the number of observations is 2,000.
After fitting the new unconstrained model on 1,000 simulated data sets and testing for monotonicity, the null hypothesis was rejected in 84.9% of the data sets for the OP 1 and in 84.5% for the second OP, in both cases with α * s = 0.05.

Dropping monotonicity constraints using the monotonicity test
The MDC procedure described in Section 3 implies that the parameter estimates of all OPs are restricted to be monotonic. However, the researcher could be interested in imposing monotonicity on a subset of OPs depending on statistical evidence to support this decision, such as the one provided by the monotonicity test. The monotonicity test could be used as a complementary tool to the MDC procedure in order to assist the estimation process. If the researcher is open to the possibility of not imposing the monotonicity constraints on some OPs, then he/she could first test monotonicity on each one of them and then perform the MDC procedure on all those ordinal predictors where the null hypothesis of the monotonicity test was not rejected. Under this scenario, in case that monotonicity is rejected for an OP, it would be more prudent to fit unconstrained estimates on the parameters associated to it. Therefore, such an OP should not be part of S, the set of OPs to be constrained, but rather part of the non-ordinal predictors, considering it at the nominal scale level.

Dropping monotonicity constraints using the MDC procedure
When dropping the monotonicity constraint for some of the OPs is considered as a feasible option, then not only the approach introduced in Section 5.1 could be used, but also an alternative one is proposed in this section. As in the previous section, consider the case where the researcher might also want to explore whether the monotonicity assumption holds for all of the OPs or for a subset of them, but now using a less conservative approach than the monotonicity test. An adjusted version of the MDC procedure described in Section 3 allows to drop the monotonicity assumption for some OPs. There are only two adjustments, one in step 2.b and the other one in step 3. The first one is to setc * s =c, i.e., the tolerance level for each OP s ∈ N 1 is set to be the same as the confidence level chosen in step 1. Therefore, the second step is not performed on any ordinal predictor s ∈ N 1 . The second modification is to apply step 3 over the possible combinations of monotonicity directions of the ordinal predictors that were classified as 'both' by the end of step 2, i.e., the number of models to be fitted is now 2 #{s:d s,c * s =both} instead of 2 #{s:s / ∈(I 2 ∪A 2 )} . This implies that S, the set of OPs to be constrained, must be updated excluding each ordinal predictor s ∈ N 1 from the set of monotonicity constraints. Finally, the model should be fitted considering these OPs as nominal scaled variables.
These adjustments are equivalent to consider the first step of the MDC procedure as a filter of OPs to be constrained, where those that are classified as 'none' by the end of this step are removed from S and excluded from steps 2 and 3.

Using the MDC procedure for variable selection
The parameter estimates' patterns classified as 'both' at the end of the second step of the MDC procedure are also of interest. 'Both' refers to an ordinal predictor for which all of the parameters associated to its categories have CIs containing zero. Therefore, if this is true even for the CIs evaluated at the tolerance level, an option is to remove such an ordinal predictor from the model of interest and apply the MDC procedure again using the new model. If more than one OP is classified as 'both' and there is appetite to drop such variables, then it is advisable to do it in a stepwise fashion such as backward elimination, while checking the results of the MDC procedure in each step, because dropping an OP could affect the monotonicity direction classification of another OP. We will not investigate this in detail here, assuming that the data is rich enough so that variable selection is not required.
Each of the options described in Sections 5.2 and 5.3, i.e., dropping monotonicity constraints for those ordinal predictors s ∈ N 1 and dropping ordinal predictors {s : d s,c * s = both}, reduces the number of models to be fitted in step 3. If these options are applied to all of the ordinal predictors classified as 'both' or 'none', then step 3 would be avoided.

Simulations
The model (4) with two ordinal and two interval scale predictors, where k = 5, i.e., j = 1, 2, 3, 4, was fitted for 1,000 data sets simulated as described in Section 3 using the following parameters: for the intercepts The number of simulated observations for each data set is 1,000. The ordinal predictors were drawn from the population distributions used in Section 3 for those covariates with the same number of ordinal categories, 4 and 6. The interval scale covariates x 1 and x 2 were randomly generated from normal distributions, N (0, 1) and N (5, 4) correspondingly.
For each one of the 1,000 data sets, model (16) was fitted following four different approaches: 1. UMLE: Using unconstrained MLE, which implies not imposing monotonicity constraints on any of the predictors.
2. CMLE: Using constrained MLE, which implies imposing monotonicity constraints on all of the OPs as described in Section 3. In this simulation, the MDC procedure was preformed choosing a 90% confidence level in step 1 (c = 0.90).
3. CMLE Bonferroni: Using constrained MLE imposing monotonicity constraints on those ordinal predictors to which the null hypothesis of the monotonicity test was not rejected as described in Section 5.1.
In the current setting, the chosen simultaneous significance level for the monotonicity test was α * s = 0.05, for s = 1, 2, and the confidence level for the first step of the MDC procedure was 90% (c = 0.90). 4. CMLE filtered: Using constrained MLE imposing monotonicity constraints on those ordinal predictors not classified as 'none' by the first step of the MDC procedure as described in Section 5.2, with the same pre-specified confidence level of 90% (c = 0.90) as in the previous approaches.
After fitting the UMLE, the MDC procedure was performed. Its first step classified the OP 1 as 'isotonic' in 100% of the cases and the OP 2 as 'antitonic' in 98.5%. The remaining 1.5% correspond to the cases where OP 2 was classified as 'none'. However, by the end of the second step, with tolerance levelc * 2 = 0.999, the MDC procedure correctly classified the monotonicity directions for 100% of the data sets with no need of performing its third step.
When applying the monotonicity test to the OPs of model (16), the null hypothesis of monotonicity was not rejected in 100% of the data sets for both OPs with α * s = 0.05. The few cases representing the 1.5% of the data sets where the OP 2 was classified as 'none' by the first step of the MDC procedure produced some differences between the approaches CMLE and 'CMLE filtered'. In contrast, the monotonicity test results did not produce any difference between the approaches CMLE and 'CMLE Bonferroni'.
Consider one of the 1,000 data sets for illustration. As shown in Figure  3, some unconstrained parameter estimates are incompatible with the monotonicity assumptions of this simulated model. Despite the fact that the first ordinal predictor is assumed to be isotonic, the UMLE yieldsβ 1,2 < 0 and β 1,3 >β 1,4 . Similar violations occur with the second ordinal predictor, which is assumed to be antitonic butβ 2,3 <β 2,4 andβ 2,5 <β 2,6 . However, the results of the CMLEs impose the monotonicity assumptions, with the estimate for β 1,2 > 0, the estimate for β 1,4 being slightly greater than the one for β 1,3 , and where the estimates for β 2,4 and β 2,6 are slightly lesser than those for β 2,3 and β 2,5 respectively.
The results of approaches 'CMLE Bonferroni' and 'CMLE filtered' were omitted in Figure 3 because they are exactly the same as the ones of the approach 'CMLE' for this particular example. This is because the results of this simulated data set show that the first step of the MDC procedure did not classify OPs 1 or 2 as 'none', and the monotonicity test did not reject the null hypothesis of monotonicity for any of these two OPs.
The CMLEs for the parameters that are not related to the ordinal predictors are, in general, not the same as the UMLEs. In this particular example, the parameter estimates associated to both intercepts and interval scale covariates are hardly affected by the monotonicity assumption when comparing the CMLE to the UMLE. Figure 4 uses boxplots to visualise the distributions of the parameter estimates of the four approaches listed above together with the parameters used in the data generation process for the 1,000 simulations.
The effect of the monotonicity constraints is depicted by the range of values of the approach CMLE for each one of the ordinal predictor categories' parameters, which differ from the UMLEs in two aspects. The first one is that the parameter estimates are forced to take values being compatible with their monotonicity direction, i.e., positive for the isotonic case and negative for the antitonic one. This is why the boxplots of the CMLE for β 1,2 and β 2,2 seem to be truncated at zero as opposed to the UMLE. The second difference is a generalization of the first one. The lower extremes of the CMLE boxplots make their corresponding whiskers shorter than the ones of the UMLE when there is an isotonic relationship, and the same effect occurs for the upper whiskers when the relationship is antitonic. However, this does not produce big differences between the UMLE and CMLE in terms of their median. Regarding the comparison between the approach 'CMLE Bonferroni' and the unconstrained MLE, exactly the same conclusions hold. This is because the monotonicity tests did not provide statistical evidence against monotonicity for any OP.
The results of the approach 'CMLE filtered' are slightly different from the other constrained approaches because there are 15 cases where the OP 2 was considered as non-monotonic. Therefore, the monotonicity constraints were not imposed on OP 2 when fitting their corresponding models. This is the reason why there are some positive values for the estimate of β 2,2 in the approach 'CMLE filtered'.
Consider the mean square error (MSE) and its squared bias-variance decomposition to compare the performance of the three different approaches of the CMLE with respect to the one of the UMLE. As shown in Figure 5, the total MSE is notably smaller for the CMLE. On average, the CMLE for the intercepts shows a 19.7% smaller MSE compared to the MSE of UMLE, 13.8% smaller for the first ordinal predictor set of CMLEs, and 25.1% smaller for the second one. The same results are for 'CMLE Bonferroni' compared to UMLE, and the corresponding figures for 'CMLE filtered' versus UMLE are 15.8%, 13.8% and 20.3%.
Despite the fact that the squared bias makes a markedly small contribution to the total MSE (lighter colours in Figure 5), it is clearly higher for some constrained parameter estimates, specially for those associated to the second ordinal predictor. The CMLE associated to the sixth category of the second ordinal predictor produced the highest squared bias, which represents 15.0% of its total MSE (the same for 'CMLE Bonferroni' and 13.0% for 'CMLE filtered'). The squared bias of the remaining CMLEs associated to the second OP together with those related to the first OP and the intercepts represent, on average, 2.4% of the MSE, ranging from 0.2% to 6.5%. The corresponding figures for 'CMLE filtered' are 1.8%, 0.2%, and 5.0%. Consequently, the MSEs are composed mainly of variances, which are considerably lower not only for the parameters associated to the ordinal predictor categories, but also for the intercepts. In general, the total MSE for the CMLE is always significantly smaller because the squared bias is more than compensated by the much smaller variance.
In the previous simulation, no non-monotonic ordinal predictor was included and its results showed that any approach with CMLE performed better than the unconstrained one. In order to analyse their performance in presence of non-monotonic OPs, consider another simulation of model (4). This time we use an ordinal response with four categories, i.e., k = 4 and j = 1, 2, 3; four ordinal predictors (t = 4) with q 1 = 3, q 2 = 4, q 3 = 5, and q 4 = 6 categories correspondingly; one interval scale predictor (v = 1); and n = 500. The chosen parameters for the intercepts are α 1 = −1.4, α 2 = −0.1, and α 3 = 1.7; for OP 1 are β 1 = (0.5, 1); for OP 2 are β 2 = (−0.65, −0.70, −1.60); for OP 3 are β 3 = (0, 0, 0, 0); for OP 4 are β 4 = (−0.8, −1.6, −0.6, 0.6, 1.6); and for the interval scale predictor β 1 = 0.3. The parameters of the OPs 1 to 4 were chosen to be isotonic, antitonic, zero, and non-monotonic correspondingly. For OP 3, all the parameters were set to zero, and therefore the monotonicity test is expected to not reject monotonicity and the third step of the MDC procedure is expected to increase the number of models to be fitted because its second step should classify it as 'both'. Hence, OPs 3 and 4 contribute to challenge the MDC procedure and the monotonicity test respectively.
This model was fitted for 1,000 simulated data sets. The ordinal predic-tors were drawn from the population distributions showed in Figure 1. The interval scale predictor was randomly generated from a normal distribution  N (1, 4). In general, the MDC procedure was executed with a 90% confidence level in the first step (c = 0.90) and tolerance levelsc * s = 0.85 andc * s = 0.999 for s = 1, 2, 3, 4 in the second step. The monotonicity direction results for the three different approaches to the constrained estimation process are the following: • CMLE: This approach demands to impose monotonicity restrictions on all of the OPs. Therefore, their monotonicity directions were always established. OPs 1 and 2 were finally classified as 'isotonic' and 'antitonic' in 100.0% of the data sets. The OP 3 remained classified as 'both' in 64.0% of the data sets at the end of the second step. Therefore, the third step was completed and OP 3 was classified as 'isotonic' in 48.8% of the data sets and as 'antitonic' in the remaining 51.2%. Finally, the first step classified the OP 4 as 'none' in 96.8% of the data sets, and by the end of the whole procedure it was classified as 'isotonic' in 84.1% of the cases and as 'antitonic' in the remaining 15.9%.
• CMLE Bonferroni: This approach does not impose monotonicity constraints on those OPs that are non-monotonic according to the monotonicity test results. The null hypothesis of monotonicity was not rejected for OPs 1, 2 and 3 for all of the data sets, with α * 1 = α * 2 = α * 3 = 0.05. Regarding OP 4, the test did reject H 0 in 83.9% of the data sets with α * 4 = 0.05. Therefore, the MDC procedure was fully applied to determine the monotonicity direction of OPs 1 to 4 for 16.1% of the data sets, whereas it was not used to establish the monotonicity direction of OP 4 in the corresponding 83.9% as it was assumed to be non-monotonic. The final monotonicity direction classifications for the 1,000 data sets were 100% 'isotonic' for OPs 1; 100% 'antitonic' for OP 2; 48.2% 'isotonic' for OP 3 (51.8% 'antitonic'); and 15.1% 'isotonic', 1.0% 'antitonic' and 83.9% 'none' for OP 4.
• CMLE filtered: Like in the previous approach, this one imposes monotonicity constraints on those OPs that are assumed as monotonic only. According to the first step of the MDC procedure, the OPs 1 and 2 were compatible with monotonicity for 100% of the data sets. However, this step classified OPs 3 and 4 as 'none' in 0.2% and 96.8% of the data sets respectively. Thus, the final monotonicity direction classifications were 100% 'isotonic' for OPs 1; 100% 'antitonic' for OP 2; 47.7% 'isotonic', Figure 6: Unconstrained MLE, three methods based on constrained MLE and parameters used for 1,000 simulated data sets with 4 OPs.
52.1% 'antitonic' and 0.2% 'none' for OP 3; and 3.2% 'isotonic' and 96.8% 'none' for OP 4. Figure 6 shows the results of the unconstrained MLE and the three different approaches for the constrained MLE.
If the researcher is not open to the possibility of dropping monotonicity constraints and there is a non-monotonic ordinal predictor, such as OP 4 in this simulation, then a higher bias is the price of imposing monotonic effects in order to get, for example, easier interpretability for all of the OPs using the approach 'CMLE'. This was expected because, for all of the data sets, this approach forced the parameters of OP 4 to be monotonic but this is not its true pattern. This is why the CMLEs for OP 4 (blue boxplots in Figure  6) depart from the other approaches and the true parameters, which also produces a high bias in the parameters associated to the intercepts.
Assuming that the researcher is willing to drop monotonicity constraints depending on statistical evidence and given that there was a non-monotonic ordinal predictor, the approaches 'CMLE Bonferroni' and 'CMLE filtered' provided reasonable results. Their performances in terms of MSE depend on the degree of conservativeness when establishing the set of OPs with nonmonotonic effects. The more conservative, the higher MSE compared to the one of UMLE. In this case, the approach 'CMLE filtered' resulted to be the best option within the constrained ones. On average, it is the only approach that produced a lower MSE (1.5% lower) compared to the one of UMLE. The MSE of 'CMLE Bonferroni' was, on average, 33.1% higher than the one of unconstrained MLE, mainly because of OP 4, which was 98.7% higher. This is because the approach 'CMLE Bonferroni' favoured monotonicity of OP 4 for 16.1% of the data sets, whereas 'CMLE filtered' did this for 3.2% of the data sets only.

Application to quality of life assessment in Chile
As an illustration of the the proposed methodology, we analyse the association between a quality of life self-assessment variable (10-point Likert scale) and ordinal and other predictors. The data source is a Chilean survey, the National Socio-Economic Characterisation 2013 (CASEN, from its name in Spanish). This survey retrieves information with the aim of characterising the population of people and households. Our analysis is based on 7,374 householders corresponding to those who live in the capital and reported the quality of life self-assessment.
The survey provides information on several variables, from which the final set of covariates was chosen on the basis of previous research in the field (see for example Di Tella et al. (2003); Cheung and Lucas (2014); Boes and Winkelmann (2010)). The data set analysed in the current section was published by the Ministry of Social Development of Chile and it is available online at: http://observatorio.ministeriodesarrollosocial.gob.cl/ casen-multidimensional/casen/basedatos.php. In order to reproduce the data set used in this section, the steps involved in the data preprocessing are described in Appendix A. The response variable is a self assessment of the quality of life (QoL). Each respondent was asked to answer the question 'Considering everything, how satisfied are you with your life at this moment?'. The possible alternatives were: '1 Completely Unsatisfied', '2',. . ., '9', '10 Completely Satisfied'.
The model was fitted with ordinal, ratio and nominal scale covariates. For the ordinal and nominal scale ones, the first category to be mentioned is considered as the baseline. The ordinal covariates are Educational Level (Edu) with categories 'Not Educated', 'Primary', 'Secondary', and 'Higher'; Income Quintile (Inc) with levels from 'Q1' to 'Q5' where 'Q5' represents the highest income; Health Status (Hea), a health self-assessment reported as ordinal Likert scale from 1 to 7, with 7 being the best possible status; Overcrowding (Ove), which is an index representing the number of people living in the household per bedroom, with categories 'Not Overcrowded' for less than 2.5, Figure 7: CMLEs and UMLEs for a model applied on real data with an ordinal response, ordinal predictors and others. Intercepts parameter estimates omitted. The 95% confidence intervals correspond to the UMLEs.
Each set of parameter estimates associated to the ordinal predictors in S was classified as either 'antitonic' or 'isotonic'. The interpretation for the relationship between an ordinal predictor and the response variable with 'antitonic' pattern is that the further away an ordinal category is from its baseline, the smaller P (y i ≤ j|x i ) is, i.e., the probability of self-assessing QoL in the jth category or smaller. In other words, 'antitonic' patterns mean that higher categories of ordinal variables are associated to more probability of self-assessing QoL in a higher part of the scale. The inverse interpretation applies for 'isotonic' patterns.
An unconstrained version of the model (4), the proportional odds cumulative logit model, was fitted to obtain the parameter estimates and their standard errors. The unconstrained parameter estimates and their 95% confidence intervals are shown in Figure 7. These results seem to be consistent with the monotonicity assumption for all the OPs. Therefore, the assumption of monotonicity was imposed on all of them and the 'CMLE' approach was used.
The monotonicity directions were established by using the MDC procedure. With a 95% individual confidence level (c = 0.95), the MDC procedure classified the sets of parameters associated to three ordinal variables as 'antitonic' in its first step (Income Quintile, Health Status, and Children), whereas Overcrowding was classified as 'isotonic' and Educational Level as 'both'. The latter decision was made because all of its 95% individual CIs contain zero as shown in Figure 7. There was no ordinal predictor classified as 'none' by the end of the first step. Therefore, there was no need of making a decision on whether dropping the monotonicity constraints for those classified as 'none' or not. Hence, A 1 = {Inc, Hea, Chi}, I 1 = {Ove}, and Educational Level was the only variable in the MDC procedure's second step. To perform this step, a tolerance level of 0.9 was set together with steps of 1% when gradually decreasing the confidence level starting from the one analysed in step one, 95%. As a result, the Educational Level ordinal variable was classified as 'antitonic' at the 92% confidence level for each confidence interval.
There was no need to execute the third step of the MDC procedure because all of the monotonicity directions were established earlier. All the ordinal predictors were finally classified as 'antitonic' except for Overcrowding, which was classified as 'isotonic'. Therefore, only one model was fitted.
We also used the monotonicity test described in Section 4 as a complementary assessment of the monotonicity assumptions. Its results are consistent with the MDC procedure in the sense that it did not reject the null hypothesis of monotonicity for any of the ordinal predictors with α * = 0.05. Some of the parameter estimators resulting from the unconstrained estimation are not in line with the monotonicity assumption. The third UMLE of Income Quintile and the second one of Health Status are greater than their previous adjacent one, which is not compatible with their monotonicity direction because they both are assumed to be 'antitonic'. In the opposite direction, the same happens with the second UMLE of Overcrowding. For instance, keeping all the other variables constant, an improvement in the Income Quintile from 'Q3' to 'Q4', i.e., an increment in the income level, increases the probability of self-assessing QoL in lower categories of the scale, according to the UMLE. The same happens with Health Status, for which changes from '2' to '3', i.e., improving the health status, seemingly increases the probability of reporting a low self-assessment of QoL. Despite the fact that the true parameters are unknown, these particular unconstrained results are counterintuitive. Therefore, it is reasonable to think that these may have been the result of random variation, and to impose the monotonicity assumption here if we want to avoid misinterpretations.
For the OP Educational Level, the UMLE allows both positive and negative values in all confidence intervals, but after having classified this OP as antitonic, with the baseline parameter fixed at zero and using the CMLE, all further parameters can only be negative.
In general, the UMLEs are compatible with a monotonic association between ordinal predictors and the response variable. However, the parameter estimates produce some violations of monotonicity. The CMLEs avoid these, and allow for a simpler and more consistent interpretation.

Conclusions
We propose a constrained regression model for an ordinal response with ordinal predictors,which can involve other types of predictors. The information provided by the category ordering of the ordinal predictors is used appropriately for ordinal data, rather than ignoring it (assuming categories as nominal) or overstating it as interval.
Each set of parameters associated to an ordinal predictor's categories can be enforced to be monotonic in our procedure, which decides automatically whether associations are isotonic or antitonic. The monotonicity direction classification procedure can classify variables not only as isotonic or antitonic, but also as compatible with both monotonicity directions or none, and the researcher may sometimes prefer to leave out variables compatible with both direction and zero parameters, and to drop the monotonicity constraint for variables incompatible with either direction, which can easily be done within the framework presented here.
The MDC relies on the choice of a pre-specified range of confidence levels betweenc * s andc * s , but the regression model itself does not require a tuning parameter and does deliver monotonic parameter estimates, unlike the penalised version in Tutz and Gertheiss (2014), which pushes parameters in the direction of monotonicity but does not necessarily achieve it.
A monotonicity test is proposed to assess the validity of the monotonicity assumption for every ordinal predictor. This checks whether the set of confidence intervals belonging to the parameters of an ordinal predictor is compatible with monotonicity or not. As this is based on the Bonferroni correction of confidence levels, it can be expected to be very conservative, and more powerful tests can probably be developed. This is left to future work.
Three different approaches for the estimation method are proposed depending on whether the researcher wants to impose monotonicity constraints on all of the OPs or some subset of them. For the first case, the MDC proce-dure is fully applied, and for the second case, the two remaining approaches differ in the way they identify the subset of OPs on which the monotonicity assumption is not imposed, one uses the monotonicity test, and the other one executes a modified version of the MDC procedure.
The CMLE for the real data application proved to be a sensible solution because it enabled a consistent interpretation for the ordinal variables' categories, which would not have been the case for the UMLE.
Asymptotic theory for the CMLE is a matter of ongoing research. This would enable us to make inference about the parameters in the constrained model.
We intend to produce one single R package containing the implementation of the constrained regression model for ordinal predictors discussed in Section 2; the MDC procedure with access to the results from each one of its steps (Section 3); and the monotonicity test discussed in Section 4.