1 Introduction

Preference mapping is a collection of multivariate statistical techniques aiming to analyse consumer acceptance of food and beverage products (Meilgaard et al. 1999; McEwan 1996; Greenhoff and MacFie 1994; Guinard et al. 2001). The two most used methods are internal preference mapping and external preference mapping (Meullenet et al. 2008; Næs et al. 2010). The former uses consumer acceptance ratings to determine a low-dimensional representation of products and consumers and then relates the sensory characteristics to the axes of this space. The latter uses descriptive sensory attribute ratings to obtain a low-dimensional representation of products and then relates the consumer ratings individually to this space. Both methods are crucial to the food and beverage industries in order to understand which sensory characteristics drive consumer acceptance of goods. This information is used by marketing and R&D divisions to adapt existing products or create new products that meet consumers’ expectations and needs.

In this paper, the focus will be on external preference mapping. Here, the most common method is based on a two-step procedure that combines principal component analysis (PCA) and least squares regression (LSR) (Næs et al. 2010). This method is often called PREFMAP. In the first step, a perceptual map of the products is obtained through a PCA of the product-by-attribute sensory matrix. The principal components (generally the first two) obtained from the analysis are called key sensory dimensions (Meilgaard et al. 1999). The perceptual map corresponds to the PCA biplot (Gower and Hand 1995), but one can also explore the plots of products and variables separately to get the same information. In the second step, a regression model is used to fit data from each consumer separately to the perceptual space, using the consumer ratings as response and the key sensory dimensions as regressors. This provides a plot highlighting the directions of preferences for all the individual consumers. This is a crucial plot because it allows to focus on general tendencies and not on individual consumers. It enables understanding if the market is homogenous, i.e. all consumers show the same preferences for the same products, or heterogeneous, i.e. groups of consumers show different tastes. The behaviour of the individual consumer is never relevant if not related to all the others. The main assumption behind this approach is that the preference for each consumer depends linearly on the sensory attributes. There exist alternative approaches named ideal point models (Næs et al. 2010) that consider also nonlinear relations, but this paper focuses on linear models.

Since PREFMAP uses LSR, it summarises the effect of the sensory dimensions on the conditional average of the consumer ratings. In other words, the classical regression model explores how the variation of the predictors affects the conditional mean of the response variable. This is a crucial point motivating the present work: the average is certainly a valid synthesis of the distribution, but it may also be interesting to evaluate the effect of the sensory dimensions on the rest of the distribution. To this end, quantile regression (QR) (Koenker 2005) can be used to provide an estimate of conditional quantiles of the dependent variable, here consumer acceptance ratings.

QR was recently used in consumer studies for relating liking to consumer factors (Davino et al. 2015), and for handling consumer heterogeneity (Davino et al. 2018, 2020). This study introduces the use of QR to PREFMAP in order to provide additional information beyond the classical average effect that the sensory dimensions exert on the consumer preference. Specifically, the variability around the regression function and in the direction of preference will be analysed through a twofold application of QR. The main objective here is to obtain a plot in the same style as for standard preference mapping to represent information linked not only to the average consumer preference but also to other aspects of the liking distribution. This is accomplished by plotting and interpreting the regression coefficients from the QR themselves, focusing in this way on variability along the two principal components (PC) directions. Moreover, QR will be used to discriminate between consumers that are more or less stable in their liking for the best and least liked samples. This approach leads to a more comprehensive interpretation of the consumer liking data. It is essential to underline that the present work will also show many individual-oriented plots. A major reason for this is that they are essential for clarifying the additional information provided by QR in this context. They may also in practice supply a more detailed interpretation or visualisation of the liking pattern for individual consumers. However, the objective of the classic preference maps, as well as the preference maps based on QR proposed here, is focused on the comparative analysis of all consumers. Consequently, it is strongly advised to give greater importance to the plots showing all consumers simultaneously, using the individual-oriented ones as interpretative aids in the analysis of the general tendency.

The paper is structured as follows: Sect. 2 presents the methodological framework, i.e. the classical approach to PREFMAP based on LSR along with a short introduction to the underlying idea behind QR and the main keys to interpret its results. The new approach to PREFMAP based on QR is described in Sect. 3. After introducing a case study based on consumer liking of yogurts (Sect. 4), results of the proposed methods are shown in Sect. 5. Finally, some concluding remarks and guidelines for future research are given in Sect. 6.

2 Methods

2.1 External preference mapping

As discussed above, PREFMAP aims to relate the product-by-attribute descriptive sensory data matrix and the product-by-liking preference matrix obtained using consumers. This is achieved through a two-step procedure that first uses the PCA to reduce the sensory data matrix into a number of sensory dimensions, and then projects the preference matrix onto this subspace.

Let X be the sensory matrix (\(I \times K\)), where the entry \(x_{ik}\) is the measured value (usually represented as intensity) of sensory attribute k (\(k = 1, \ldots , K\)) for product i (\(i = 1, \ldots , I\)). Note that we only consider here descriptive sensory data averaged over the panel of trained assessors. Throughout the paper, X will be assumed column centred.

The PCA is first used for modelling the sensory characteristics, i.e.

$$\begin{aligned} {\varvec{X}}={\varvec{T}}{\varvec{P}}^T+{\varvec{E}}, \end{aligned}$$
(1)

where T is the matrix (\(I \times A\)) of the principal component scores (used in the analysis) and P is the matrix (\(K \times A\)) of the loading values that define the contribution of each of the original sensory variables in the computation of the principal components. The symbol A refers to the number of components included in the model, with \(A \le {\text{rank}}({\varvec{X}}) \le {\text{min}} (I-1, K)\), and the matrix E representing the components not used, often interpreted as random noise (Jolliffe 1986). The scores and the loadings will be separately visualised through specific scatter plots, generally named scores plot and loadings plot, or combined in one single plot named biplot (Gower and Hand 1995).

Let Y be the matrix of liking values (\(I \times J\)), where the entry \(y_{ik}\) is the measured liking value of product i and consumer j (\(j = 1, \ldots , J\)). Also, the Y matrix is generally column centred (across products) in order to remove a different use of the scales among the consumers (Romano et al. 2008). In some cases, the Y matrix is synthesised into one or a few variables, which then correspond to the mean liking scores by product across all consumers or across consumers who belong to a homogeneous cluster. The rest of the paper will consider the whole matrix Y, i.e. liking will be analysed at the individual level.

In PREFMAP, the liking values for each consumer are regressed onto the first sensory dimensions, most often the first two PCs (i.e. A = 2). The most used model for this is the multivariate linear regression model

$$\begin{aligned} {\varvec{Y}}={\varvec{TB}}+{\varvec{F}}, \end{aligned}$$
(2)

where T comes from the PCA model, B represents the regression coefficients (also called consumer loadings), and F represents the residuals. For each single consumer (j), the model based on the first two principal components can be written as

$$\begin{aligned} y_{ij}= \beta _{j1} t_{i1}+\beta _{j2} t_{i2}+f_{ij} \end{aligned}$$
(3)

The coefficients of this regression model describe how the first two sensory dimensions are related to the consumers’ liking scores. The model does not include the intercept as the variables are centred. Since a linear model is assumed, an increase in the T sensory scores will produce either an increase or decrease of the consumer liking depending on the sign of the coefficient. The estimated regression coefficients \(\hat{\beta _1}\) and \(\hat{\beta _2}\) are plotted in the same way as the loadings for the PCA. Note that the two-step procedure discussed above corresponds to a specific multivariate linear regression method known as principal component regression (PCR) (Næs et al. 2002). In summary, this two-step procedure provides a scores plot from the sensory analysis, a loadings plot from the sensory analysis and a plot of the regression coefficients from the LSR of liking for the sensory dimensions.

Note that the use of the linear model (3) involves the fitting of a plane to the data in the (\(t_1\), \(t_2\), y) coordinates. This means that the liking is assumed to be the same along contours orthogonal to the direction of steepest ascent, which is the direction indicated by the regression coefficients plotted in the PREFMAP plot.

It is worth noticing that the main focus in the rest of the paper will be on comparing regression coefficients in different models (LSR and QR). It is thus advisable to consider standardised scores to remove scale effect. The model (3) after standardisation of the scores with respect to the corresponding standard deviation (sd) can be formulated as

$$\begin{aligned} y_{ij}= \beta _{j1} \frac{t_{i1}}{sd(t_{1})} + \beta _{j2}\frac{t_{i2}}{sd(t_{2})} + f_{ij} = \beta _{j1} u_{i1}+\beta _{j2} u_{i2} + f_{ij}, \end{aligned}$$
(4)

This means that we switch the standardisation to unit variance from PC loadings to PC scores. This is not standard in PREFMAP, but it is reasonable here since we put emphasise on comparison among coefficients.

2.2 Quantile regression

Quantile regression (QR), as introduced in Koenker and Bassett (1978), offers a possible approach for modelling the whole conditional distribution of a dependent quantitative variable. According to Equation (4), the QR model, connecting the liking of each single consumer to the standardised principal components \(\varvec {u}_{i}\), is estimated for different quantiles \(\theta \in [0,1]\)

$$\begin{aligned} y_{i}=\varvec {u}_{i}^{T} {\varvec{\beta }}\left( \theta \right) +\epsilon _{i}\left( \theta \right) ,\end{aligned}$$
(5)

where the first element of \({\varvec{u}}_i\) is equal to 1 to allow for an intercept, and the \(y_{i}\) corresponds to the liking score of each consumer for the different samples i. It is worth of notice that the \(\theta \)–th conditional quantile of the error term is equal to zero, while no parametric assumption for the error (and hence response) distribution is required.

Specifically, the conditional quantile estimator (Koenker and Bassett 1978) is

$$\begin{aligned} {\varvec{\hat{\beta }}}\left( \theta \right) ={\text {argmin}}_{{\varvec{\beta }}(\theta )} \sum _{i=1}^n \rho _{\theta } \left( y_i-\varvec {u}_{i}^{T} {\varvec{\beta }}\left( \theta \right) \right) , \end{aligned}$$
(6)

where \(\rho _{\theta } \left( .\right) \) is the check function, which asymmetrically weights positive and negative residuals

$$\begin{aligned} \rho _{\theta }(v)= \left\{ \begin{array}{lll} {\theta v} &{} \text{ if } &{} {v > 0} \\ {\left( \theta - 1 \right) v} &{} \text{ if } &{} {v\le 0} \end{array} \right. \end{aligned}$$
(7)

Equation (6) provides a quantile regression line for each conditional quantile of interest. Even if it is potentially possible to estimate an infinite number of conditional quantiles, only a finite number of distinct solutions exists, depending on sample size and data structure (Davino et al. 2013; Furno and Vistocco 2018). In practice, it is quite common that the researcher defines the quantiles of interest which, in most cases, are the three quartiles, \(\theta =[0.25, 0.50, 0.75]\), along with two extreme quantiles, typically \(\theta =[0.10, 0.90]\). For each quantile of interest, a regression model is estimated, and a set of coefficients are obtained. In the case of a simple regression, the single QR lines can be visualised in the typical plot displaying the response variable on the vertical axis and the regressor on the horizontal axis. The QR coefficients are interpreted in the same way as the LSR coefficients, i.e. as rates of change. QR coefficients focus on the effect of each regressor on the \(\theta \)-th quantile of the response, while LSR explores the same effect on the conditional mean.

In the preference mapping context, QR is used for each consumer separately. This means that the response variable corresponds to the liking scores given by each consumer for the I products, while the predictors are the sensory dimensions obtained from PCA. Thus, QR coefficients describe the effects of the selected sensory dimensions (the PCA components) on the least (low \(\theta \)) or the most (high \(\theta \)) liked products for each single consumer. For instance, a very high value of \({\hat{\beta }}_{j1}(\theta =0.25)\) means that an increase in the sensory variables highly correlated with the first component, would involve an increase in the lower part of the distribution around the regression line as well. This means, for instance, that the liking for the least liked samples to the left and to the right in the PCA plot also increases.

Although different functional forms can be used, this paper is limited to linear regression models in line with the classical PREFMAP approach. The estimation of the regression quantiles typically exploits linear programming (Koenker and D’Orey 1987). Interior-point methods (Koenker 2000) and Bayesian approaches (Yu and Moyeed 2001) have also been proposed. The interested reader is referred to Furno and Vistocco (2018) for further details. QR estimators are asymptotically normal distributed, the forms of the covariance matrix depending on the model assumptions (Koenker and Bassett 1978; Koenker and Basset 1982a, b). Resampling methods can represent a valid alternative to the asymptotic inference; they allow to estimate the standard errors of the parameters without requiring any assumption for the error distribution (Gould 1993). For a review on QR resampling methods, see (Kocherginsky et al. 2005). Further discussion of QR is beyond the scope of this paper, but the interested reader may consult (Koenker 2005; Davino et al. 2013; Furno and Vistocco 2018).

3 Strategies for using QR in PREFMAP with different perspectives

As discussed in Sect. 1, the purpose of the paper is to integrate QR in the context of PREFMAP in order to enrich the results of traditional preference maps. This paper proposes two different strategies based on QR to enhance a PREFMAP. Both strategies are introduced to highlight possible different consumer behaviour with respect to the less or most preferred products. Figure 1 summarises the steps of the proposed procedure, from data to final results.

Fig. 1
figure 1

Description of data structure and graphical tools in PREFMAP and QR-PREFMAP

The starting point is the product-by-attribute and product-by-liking matrices. Once the principal component scores (usually two components) are extracted from the sensory matrix, the first strategy explores the variability of liking along the principal components for each consumer. This goal is pursued by extending the classic approach for the construction of preference maps (using LSR) to the exploration of effects for different levels of liking (using QR). The interpretation of PREFMAP will be complemented by the information provided by the corresponding maps to different quantiles (QR consumer loadings plots).

The second strategy focuses on the direction of preference for each single consumer and uses the QR at two different quantiles to distinguish between consumers who converge or diverge in their preference pattern along this direction. The additional information deriving from the use of QR will be represented on ad-hoc QR consumer loadings plot.

3.1 Analysis based on quantile regression coefficients

In PREFMAP, least squares regression allows to estimate the relationship between the liking and the sensory dimensions for each consumer. A visual comparison of all the estimated coefficients (consumer loadings plot) provides an overall view of the preference directions and highlights how many consumers would be influenced by possible changes in the sensory characteristics.

Our proposal is based on estimating selected QR models for each consumer, \(\theta =[0.25, 0.50, 0.75]\), in addition to the LSR model. If the first two components are retained, the equation used for QR can be written as

$$\begin{aligned} y_{ij}=\beta _{j0}\left( \theta \right) +\beta _{j1}\left( \theta \right) u_{i1}+\beta _{j2}\left( \theta \right) u_{i2}+g_{ij}\left( \theta \right) \end{aligned}$$
(8)

where \((0< \theta < 1)\). Note that as opposed to the standard LSR approach based on model (3), an intercept is needed in the QR.

In the following, we will consider two different ways of visualising the coefficients.

  1. 1

    For each consumer, the introduction of QR in PREFMAP provides a set of coefficients for each quantile of interest. They measure information about possible differences in variability of liking for the sensory region for the most and least liked samples. Note that a consumer with similar coefficients for two different quantiles (for instance 0.25 and 0.75) is a consumer with a stable variability around the regression line/plane. On the other hand, large differences indicate more variability either for the least or best liked samples (Appendix A2 reports the graphical representation generally used to interpret the coefficients of a quantile regression). In the preference mapping approach, the QR consumer loadings plot allows representing all the regression coefficients of the various consumers. However, it is possible to have as many coefficients as there are quantiles, even if the two quartiles are generally used. Therefore, the QR−PREFMAP approach focuses on the interpretation of the two QR consumer loadings plots at \(\theta = 0.25\) and \(\theta = 0.75\), respectively.

  2. 2

    An alternative joint representation for all consumers is also introduced. It enables a simultaneous interpretation of results related to two different quantiles (e.g. \(\theta = 0.25\) and \(\theta = 0.75\)). The j-th consumer is represented through two points (\({\varvec{p}}_j^{\theta _1}\) and \({\varvec{p}}_j^{\theta _2}\)) based on the coefficients estimated at the two selected quantiles

    • coordinates for \({\varvec{p}}_j^{\theta _1}\) are \(\hat{\beta }_{j1}\left( \theta _1\right) \) and \(\hat{\beta }_{j2}\left( \theta _1\right) \)

    • coordinates for \({\varvec{p}}_j^{\theta _2}\) are \(\hat{\beta }_{j1}\left( \theta _2\right) \) and \(\hat{\beta }_{j2}\left( \theta _2\right) \)

In the plot, the two points (\({\varvec{p}}_j^{\theta _1}\) and \({\varvec{p}}_j^{\theta _2}\)) will be linked by an arrow starting from the lowest and ending in the highest quantile. The length of such a line is a measure of how much the variability of liking varies for different groups of samples (points in the PCA scores plot). It can be easily computed using a distance measure (e.g. the Euclidean distance between \({\varvec{p}}_j^{\theta _1}\) and \({\varvec{p}}_j^{\theta _2}\)). The position and the direction of each arrow highlight similar or different liking tendencies (for the different consumers) for the two components and at the different quantiles. For example, if the j-th consumer is represented by an arrow located in the first quadrant, and the arrow is directed towards the positive direction of both axes, this means that both quantiles on both components are positive and that both increase as functions of the components. In other words, both the lower and upper parts of the distribution go in a positive direction as the components go in the positive direction. On the other hand, arrows that cross between two quadrants mean that the upper and lower parts of the distribution have a different development as a function of the principal components. Concrete examples will be given in the case study section (Sect. 4, Figs. 5 and 7).

Further interpretation tools are also introduced to combine LSR and QR results as will be shown later in the case study (Fig. 8).

3.2 Analysis based on direction of preference

A complementary analysis can be carried out considering the direction of the most liked samples for each consumer. Such a direction, defined by the LSR coefficients \(\beta _1\) and \(\beta _2\) in equation (3), can be exploited to evaluate if the distribution of liking is wider or narrower in the direction of increased liking.

The procedure is structured in the following points:

  1. 1

    The liking values for each consumer are regressed onto the first two sensory dimensions (as described in Sect. 2.1):

    $$\begin{aligned} y_{ij}=\beta _{j1}u_{i1}+\beta _{j2}u_{i2}+f_{ij} \end{aligned}$$
    (9)

    The estimates \(\hat{{\varvec{\beta }}}_j=\left( \hat{\beta }_{j1}, \hat{\beta }_{j2} \right) \) define the direction of the steepest ascent in the principal component plane.

  2. 2

    The scores for each sample i are then projected onto the direction of steepest ascent (in the \(u_1\), \(u_2\) space):

    $$\begin{aligned} \begin{array}{lll} s_{ij}=\left( \hat{{\varvec{\beta }}}_j^T \hat{{\varvec{\beta }}}_j\right) ^{-1}\hat{{\varvec{\beta }}}_j^T \varvec {u}_i&\text{ where }&\varvec {u}_i=\left( \varvec {u}_{i1}, \varvec {u}_{i2} \right) \end{array} \end{aligned}$$
    (10)

    It can be shown that the LSR solution obtained by regressing y onto s has the same slope as the slope of the regression plane in the direction of the steepest ascent (see the Appendix A1 for the proof). If the vector \({\varvec{\hat{\beta }}}_j\) is previously normalised, the measurement scale of s will be comparable to the unit in the principal component scores space.

  3. 3

    Two QR models are estimated for \(\theta =0.25\) and \(\theta =0.75\)

    $$\begin{aligned} \hat{y}_{ij}\left( \theta \right) =\hat{\beta }_{j0}\left( \theta \right) +\hat{\beta }_j\left( \theta \right) s_{ij} \end{aligned}$$
    (11)

    For each consumer, the two QR lines can diverge or converge as a function of s, thus providing information on the variability of the liking along the direction of steepest ascent. Consumers can then be classified according to whether the variability is larger for the most liked area in the sensory space than for the least liked samples. This information can be achieved by fixing two values s1 and s2 on the s-scale (\(s1<s2\)) and comparing the distance between the two QR lines in s1 and s2. If for instance the distance between the two QR lines is greater in s2 than in s1, this means that liking range is larger in the direction of the most liked samples. This indicates that the consumer has a more “flexible” pattern for the most liked samples. The same consumer has a small range of liking values in the other end of the liking scale. Note that consumers more “flexible” for the most liked products, later will be called diverging consumers, as opposed to the converging consumers that are more “flexible” for the least liked products. An example of different types of consumers is given below in Figs. 10 and 11 (please, refer to next Section for guidelines).

4 Case study

The data set used for the illustration of the method is based on a set of yogurts with the same calories and the same composition but with different consistency. Specifically, 8 samples were obtained from a \(2^3\) experimental design in which the three design factors were viscosity (thin/thick), particle size (flake/flavour) and flavour intensity (low/optimal). The details of the experiment can be found in Nguyen et al. (2018). Table 1 provides a description of the samples according to the experimental factors. The product labels in brackets refer to the different possible combinations of the levels of the three factors (t=thin, T=thick, f=flour, F=flakes, l=low, o=optimal).

Table 1 Description of the yogurt samples according to the design factors. The details of the experiment can be found in Nguyen et al. (2018)

The samples were profiled by a sensory panel consisting of 10 trained assessors using standard quantitative descriptive analysis (QDA) (Lawless and Heymann 2010). It is a highly trained panel that is assessed frequently on essential qualities such as discrimination, repeatability and agreement. A list of totally 21 attributes was used: six odour attributes (intensity, acidic, vanilla, stale, sickening, oxidised), three taste attributes (Sweet, Acidic, Bitter), six flavour attributes (intensity, sour, vanilla, stale, sickening, oxidised) and six texture attributes (thick, full, gritty, sandy, dry, astringent). The complete list of sensory attributes and labels used in the analysis is shown in Table 2. Samples were served in plastic containers coded with 3-digit random numbers and in a sequential monadic manner following a balanced presentation order.

The same samples were evaluated by a consumer panel consisting of 101 consumers. The consumer group was split in subgroups, and within each subgroup, the order of (blind) tasting was the same. The design for the splitting was developed for balancing the first and higher orders of carry-over effect. The yogurts were rated for the degree of liking on a scale from 0=dislike extremely to 100=like extremely.

Table 2 Description of the sensory attributes

5 Results

The least squares PREFMAP and the quantile regressions were performed using the FactoMiner (Lê et al. 2008) and the quantreg (Koenker 2021) packages, available in the R statistical software (R Core Team 2020). Graphical representations were obtained through ad-hoc R code.

5.1 Least squares approach to PREFMAP

Results from the PCA on the correlation matrix for the yogurt data are shown in Figs. 2 and 3. The scores plot is shown in Fig. 2. It can be seen that the first two components explain most of the variation (82.5%). Along the first component one can notice a clear distinction between the flour-added samples on the right side (tfl, Tfl, tfo, Tfo) and the flake-added samples on the left side (tFl, TFl, tFo, TFo). According to the sensory loadings shown in Fig. 3, the first group of products is characterised by oxidised and stale flavour, a bitter taste and sandy, dry and astringent texture. On the other hand, the flake-added samples present a gritty texture, intensity, vanilla and sour flavour, and intensity, vanilla and acidic odour. The second component is mostly related to distinguishing product 7 (tfo), characterised by sickening odour and flavour, from product 2 (TFl) characterised by fullness and thickness.

Fig. 2
figure 2

Sensory scores. The description of products can be found in Table 1

Fig. 3
figure 3

Sensory loadings. The list of sensory variables can be found in Table 2

The consumer loadings plot in Fig. 4 shows the regression coefficients indicating the direction of preference for each consumer. Here, the product 7 (tfo), which in Fig. 2, was placed along the positive side of the second component, can be considered the least preferred product, since most of the consumers are on the opposite side, that is, they lie along the lower part (towards the negative side) of the second component in the consumer loadings plot. Apart from that, there are consumers who like products in all the four quadrants. Note that consumers are marked according to quadrants and some of them (C1, C35, C47, C57, C75, C87) are highlighted since they show different patterns regarding the quantiles, as will be described below.

Fig. 4
figure 4

Consumer loadings (regression coefficients) marked according to quadrants (the same symbols will be used for labelling the points in Fig. 5). The marked consumers are selected because they show different patterns regarding the quantiles as will be discussed further below

5.2 Quantile regression approach to PREFMAP

5.2.1 Analysis based on regression coefficients

The procedure described in Sect. 3.1 is applied for the yogurt data set considering two quantiles of interest (\(\theta = [0.25, 0.75]\)). This means that, for each consumer, the liking values are regressed onto the sensory dimensions at the two selected quantiles and the set of corresponding coefficients is estimated.


Visualising change of quadrant as compared to PREFMAP


In Fig. 5, two preference maps are visualised, representing the coefficients estimated at the two quantiles of interest 0.25 and 0.75, respectively. Consumers are labelled with different symbols according to the original position in the least squares’ preference map.

To a large extent, the two plots resemble the standard PREFMAP plot, showing that the structure of the quantiles is not so different from the structure of the least squares estimates. It can, however, be seen that, at least a moderate number of consumers in the lower part of the plots are spread out to other quadrants for both \(\theta =0.25\) and \(\theta =0.75\). From the standard PREFMAP above, we know that this is the area/direction with the highest frequency of consumers. Consumer C1 changes quadrant for \(\theta =0.25\), moving from the first quadrant in the LS map to the third quadrant in the QR map. Consumers C35 and C57 (Fig. 5) are among those who change quadrant for \(\theta =0.75\). As can be seen, however, only the second PC is interested by the change. For consumer C47, who also changes quadrant, both axes are involved, but also in this case, the second component is the most important.

The interpretation of the positions of the consumers among the LSR-PREFMAP and QR-PREFMAPs must consider eventual noise. To this end, the standard errors of the QR coefficients are a valuable source to check if QR estimates provide meaningful information compared to the standard PREFMAP. The standard errors of the two QR coefficients obtained through the bootstrap xy-pair (Buchinsky 1995) method are represented in Fig. 6. In particular, the figure at the top refers to the standard error of the QR coefficients at quantile 0.25, and the figure at the bottom at quantile 0.75. In order to allow an easier interpretation of the results, also visual, in Fig. 6 only the labels of the consumers to be warned are shown, i.e. those highlighted also in the previous plots (Figs. 4 and 5). Concerning the standard errors at \(\theta =0.25\) (top part of Fig. 6), it is worth noting that there is a group of consumers (C1, C47 and C57) with very high standard errors both for the first and the second principal components. In contrast, C35 has a high variability just in the second direction. The same rationale can be followed to interpret the results at \(\theta =0.75\) (bottom part of Fig. 6). That means that the interpretation of the informative contribution provided by the quantile regression for C47 and C57 consumers must be done with caution because, although these consumers show at quantile 0.75 a differentiated behaviour with respect to the LS estimate (in the bottom panel of Fig. 5 they have positive coefficients for both components), they present rather high standard errors especially for β1 (bottom panel of Fig. 6). It is worth noting that the standard errors are high for almost all coefficients relative to the value of the corresponding coefficients. This result suggests new research developments in the uncertainty analysis of results in quantile preference mapping and classical ones.

Fig. 5
figure 5

Preference maps with QR estimates at the quantile \(\theta =0.25\) (top), and \(\theta =0.75\) (bottom)

Fig. 6
figure 6

Bootstrapped standard errors at the quantile \(\theta =0.25\) (top), and \(\theta =0.75\) (bottom). The standard errors of the two coefficients \(\hat{\beta }_{1}\) and \(\hat{\beta }_{2}\) are depicted on the two axes


Visualising paths of consumers’ preferences


Figure 7 simultaneously visualises, for a subset of consumers, the two quantile coefficients. The 16 consumers have been selected because they show a different behaviour with respect to changes of quadrant. Each consumer is represented using \(\hat{\beta }_1\) on the horizontal axis and \(\hat{\beta }_2\) on the vertical axis, the length of each arrow representing the distance between QR estimates at the 0.25 and 0.75 quantiles (see Sect. 3.1 point 2, for further details). As stated below, the two points representing each consumer are linked by an arrow in the direction from \(\theta =0.25\) to \(\theta =0.75\). Arrows crossing between two quadrants represent consumers with different signs at \(\theta =0.25\) and \(\theta =0.75\). It can be seen that consumer C57 has a negative \(\theta =0.25\) coefficient for the second component, as previously discussed, and that this becomes slightly positive for \(\theta =0.75\). Note that the inclination of the arrow also provides important information. For example, the fact that the change for C57 is almost parallel to the second component means that the coefficients at the two quantiles for the first component are almost equal for this consumer (as also shown in Fig. 12). However, in this case, the information on the inclination has to be considered with caution given the high uncertainty on \(\hat{\beta }_{1}\) at \(\theta =0.75\) for this consumer (see Fig. 6). Focusing on the second component, C57 crosses between two quadrants, which means that she/he changes position on the map, as also shown in Fig. 5 (bottom). The length of each arrow is still informative: consumers with a short line between the two quantiles are consumers with a uniform liking pattern (similar variability around the regression plane) over the sensory space.

Fig. 7
figure 7

QR loadings plot focusing on the two quantiles 0.25 and 0.75. Each arrow starts at \(\hat{\beta }_{\text{a}}(0.25)\) and ends at \(\hat{\beta }_{\text{a}}(0.75)\), \(a=(1,2)\). The length and the direction of each arrow provide information about the pattern of consumer preference for her/his less and most liked products. A subset of consumers have been selected due to their peculiarity


The PREFMAP plot equipped with information about quantiles


Figure 8 combines the standard PREFMAP with information about the quantiles. The points are now labelled with different symbols according to the effect of each component at the two quantiles (0.25 and 0.75). The consumers that have the same sign for the coefficients estimated at \(\theta = 0.25\) and \(\theta = 0.75\) on both the axes are here called concordant and are represented by a triangle. Discordant consumers, that is consumers having regression coefficients of opposite sign on at least one component for a selected \(\theta \), are represented by arrows crossing between two quadrants in Fig. 7 and here depicted through circles. Consumers with a short line between the points are consumers with a relatively uniform pattern of the distribution around the regression plane. The dimension of the points in Fig. 8 is proportional to the length of each arrow in Fig. 7.

We can note that many consumers have coefficients of the same sign with respect to both PCs for the two quantiles (see for instance C87 discussed in Fig. 12, in Appendix A2). On the other hand, C57 (also discussed in Fig. 12) is represented with a circle, which suggests that she/he has different sign of coefficients for at least one component. The circle is large due to the length of the arrow for C57 in Fig. 7. A certain tendency can be spotted in the plot: most of discordant consumers are concentrated towards the middle of the map, while most of concordant consumers are located towards the outer parts of the plot. In other words, for consumers with the stronger liking structure (far from the centre), the upper and lower quantiles follow a similar pattern.

Fig. 8
figure 8

QR loadings plot combining LSR and QR results. The size of the points is proportional to the length of the corresponding arrows in Fig. 7, while the different shape corresponds to the consistency of the coefficients: circles for the 55 discordant consumers, triangles for the 46 concordant consumers

As stated before, the length of each arrow (size of points in Fig. 8) can be considered as a measure of the variability of liking patterns moving from less to most preferred products. Figure 9 shows the distribution of such measure both for concordant (top panel) and discordant (bottom panel) consumers. Concordant consumers show a more uniform liking pattern (similar coefficients on different \(\theta \)’s for both components) as compared to discordant consumers.

Fig. 9
figure 9

Density plot of the degree of concordance/discordance of consumers

Table 3 shows the same information of Fig. 7 but for the whole sample of consumers. The rows and columns correspond to the quadrant position of each consumers in the QR loadings plot in Fig. 5 top and bottom, respectively. Focusing on the values in brackets (conjoint frequency), the concordant consumers are on the diagonal. They are those consumers who are represented by an arrow lying in one quadrant without crossing between different quadrants. The row percentages allow to appreciate the movement of consumers between quadrants. For instance, the value \(50\%\) in the first cell highlight concordant consumer on the first quadrant, while the remaining values on the first row present the estimated QR coefficients at \(\theta =0.75\) in the second (\(7\%\)), third (\(21.4\%\)) and fourth (\(21.4\%\)) quadrant.

Table 3 Distribution of the consumers with respect to their quadrant position in PREFMAP with QR estimates at \(\theta =0.25\) (rows) and \(\theta =0.75\) (columns)

5.2.2 Analysis based on direction of preference

The second proposed approach examines variability in the direction of preferences (s in Formula 10) obtained from the consumer loadings. Figure 10 shows, for three consumers C57, C75 and C87, the quantile regression lines at \(\theta =\left[ 0.25, 0.75\right] \) (see Formula 11) relating y and s for the 8 products. These three consumers have been selected because they show different tendencies. Consumer C87 presents diverging lines in the direction of preference, meaning that the variability of the liking is higher in that direction. This means that even in the area of highest liking there is still some variability. For the least liked area, the liking is more consistent. The opposite holds for consumer C57, while consumer C75 presents the case of a stable liking (uniform liking pattern). In order to measure the degree of such variability, the distance between fitted values at \(\theta =0.25\) and \(\theta =0.75\) has been computed at two fixed values s1 and s2 for the regressor variables (see Sect. 3.2). Specifically, s1 and s2 correspond to the first and third quartiles of the \(\varvec {s}\) regressor. In case the two distances between the fitted values differ not more than a fixed threshold the lines are considered parallel (the choice of the threshold is data driven). Based on this, we decided to consider three consumer categories: parallel, diverging and converging.

Fig. 10
figure 10

A consumer with a diverging (C87), converging (C57) and parallel (C75) liking in the direction of most liked samples. Centred data, therefore negative liking for some samples

Preference map with additional information about quantiles.

Finally, Fig. 11 depicts again the consumer loadings plot from standard PREFMAP (see Fig. 4), but setting now the size of the points proportional to the variability measure previously computed (based on two fixed values of \(\varvec {s}\), one low and one high) and their shape related to the distribution around the regression line in the direction of preference (converging, diverging, parallel).

Figure 11 highlights a relatively clear tendency of higher convergence to the left and divergence to the right. In other words, for the sensory region represented by samples P3 (tfl), P4 (Tfl) and P8 (Tfo) in Fig. 2, the liking is more “flexible” than in the opposite direction. The best liked samples to the right are liked even more than the average from PREFMAP sharing that the least liked samples are liked less than the average indicates. Quadrant 2 is totally dominated by converging consumers. With the exception of a few, the parallel consumers seem to be quite centrally positioned, i.e. most of them are consumers with low or moderately strong preference pattern (coefficients moderately large).

Fig. 11
figure 11

A variant of the consumer loadings plot: the size of the points is proportional to the QR variability measure computed for each consumer through the distances between the two QR lines in Fig. 10. Consumers are classified in converging, diverging or parallel according to the value of such measure

6 Concluding remarks and further developments

The proposed procedure is a direct extension of standard external preference mapping and as such the researcher can base her/his interpretation on tools very similar to what she/he is used to. Among other things, it has been shown how additional information can be obtained by equipping the standard PREFMAP with points in different colour and/or shape for highlighting various aspects related to the structure of the variability around the PREFMAP regression plane. Plots for the whole consumer group and plots for individual consumers have been presented. In particular, we recommend Figs. 8 and 11 for this purpose. For greater completeness, both figures could be filtered, removing consumers with a high degree of uncertainty according to the values of the standard errors of QR coefficients (Fig. 6). As discussed in Sect. 1, preference mapping should provide a comparative analysis of consumer preference patterns. Nevertheless, individual-oriented graphics can be obtained to highlight the additional information provided by QR for each consumer.

An interesting possibility, not explored here, could be to relate the structures found in Figs. 8 and 11 to additional information about consumers, whenever available. As an example, information about attitudes, habits or demographics.

The tools presented in this paper enable reliable conclusions when a reasonable number of samples in the preference test is available. This happens in standard preference maps as in QR maps and underlines the importance of considering an overall perspective in interpreting the results. In this case, only eight samples were used, a relatively small number of products. However, in consumer studies dealing with food, this is a typical situation. Indeed, in most studies of sensory quality of products it is extremely important to keep the number of samples low; this is especially true for consumer liking studies. Presenting more than 8–12 samples to consumers is very difficult. This has to do, among others, with consumer attention and also fatigue. Therefore, there is always a trade-off between statistical precision and what is advisable to apply in practical terms. The small number of samples show how often inferential techniques for individual consumers are of less importance in such a context. Therefore, the overall pattern of consumers’ liking patterns should be given major emphasis. However, in situations where it is feasible to collect the preference of many samples, such as for sound (Berget et al. 2021) and image data, and it is reasonable to use inferential techniques, bootstrap would be the recommended choice for the QR preference mapping. This resampling method does not require indeed assumptions on the error distribution to quantify uncertainty (Gould 1993; Kocherginsky et al. 2005; Tarr 2012).

The main idea underlying the paper is that QR perspective can offer a complementary information to the classical approach useful to highlight further sources of variability in consumer preferences that would otherwise be neglected. QR has been exploited in the classical external preference mapping focusing on linear models. Future research line could concern the use of nonlinear models, when sample size allows it.