1 Introduction

While serving consumers online provides multiple benefits for online retailers (e.g., reaching consumers worldwide), it is also tied to some disadvantages inherent to distance trading. In particular, product (fit) uncertainty (Hong and Pavlou 2014) and the missing touch and feel of products (Shulman et al. 2011) result in large numbers of product returns. These product returns not only cause enormous costs for online shop operators (Samorani et al. 2019; Yan and Pei 2019) but also negatively affect the environment (Dutta et al. 2020; Pålsson et al. 2017). The number of returns is particularly high in the online fashion business due to its less standardized products (Difrancesco et al. 2018; Saarijärvi et al. 2017), the need for clothing to fit correctly (Gallino and Moreno 2018; Gelbrich et al. 2017), and the importance of apparel’s texture (Ofek et al. 2011). Since a more or less lenient return policy in this business can either trigger higher purchase frequencies or prevent consumers from buying products (Hjort and Lantz 2016; Janakiraman et al. 2016), it is crucial to find the golden mean when deriving managerial implications.

While the vast majority of previous studies focused on finding optimal countermeasures for keeping return rates low without scaring off potential customers either before or after the purchase, we contribute to the literature by examining the problem of returns holistically. To this end, we extend the two-step decision perspective from Wood (2001), according to which online purchase decisions are divided into the (first) decision for or against a purchase and the (second) decision for or against keeping the product, by analyzing measures to prevent product returns in three stages. These measures comprise supporting consumers searching for fashion products (pre-purchase stage), assistance in the ordering process (purchase stage), and strategies inducing consumers to keep the product (post-purchase stage). By investigating all three stages with the same methodological approach, we enable a direct comparison of measures for reducing returns. We use Kano's “Theory of Attractive Quality” (Kano et al. 1984) as a basis and have respondents categorize several measures accordingly. Moreover, to the best of the authors’ knowledge, we are the first to apply the (segmented) Kano method to product returns and thus reveal those return measures that increase customers’ satisfaction the most.

Furthermore, we address potential solutions for product returns by implementing the most recent technological advances, such as virtual fitting of articles or 360° views of the products. Hence, we want to shed light on how consumers evaluate measures for preventing product returns in the context of online fashion shops at each of the three stages and to what extent they affect consumers’ satisfaction. By answering this question, we cover recently postulated research gaps (Janakiraman et al. 2016; Samorani et al. 2019) and indicate how managers could efficiently allocate financial budgets regarding their return policy.

This study is structured as follows: first, we illustrate return management, its most recent developments, and technological improvements, as well as drivers of returns. We then describe our methodological approach and present the results. After discussing these, we end with a conclusion and directions for future research.

2 Theoretical background

With ever-increasing numbers of online shopping orders, the issue of product returns also becomes more critical. Even if the current return ratio remains constant, the growing absolute number of returns will heavily affect the environment (Dutta et al. 2020; Pålsson et al. 2017). Furthermore, product returns constitute a cumbersome, unpleasant task for companies and consumers alike. As the e-commerce industry still struggles to provide sufficient and appropriate product information for customers to prevent (or at least reduce) returns (Gelbrich et al. 2017), and thus might not be able to offer suitable solutions soon, it is essential to explore product returns in comprehensive depth and based on recent technological advancements. Following the theoretical framework of the Confirmation–Disconfirmation paradigm in the context of products bought online (Hong and Pavlou 2014), the satisfaction with the delivered product (post-purchase) might be (1) lower than expected, resulting in a negative confirmation, (2) as expected, resulting in zero (dis)confirmation, or (3) higher than expected, resulting in positive confirmation.

2.1 Return management and recent developments

A return can be understood as the result of a comparison of expectations (while shopping online) and reality (when receiving the product), as illustrated in Fig. 1. In the context of fashion, the expectations concern the nature of the article (correct article) and the fit (correct fit), whereby the comparison of the expectations to the actual product can be moderated by curating the offer, e.g., through personal or personalized outfit recommendations. Resolving the information gap then leads to satisfaction or dissatisfaction with the ordered article. Nevertheless, it can be assumed that satisfaction alone does not directly determine the return behavior. A customer can be satisfied with a delivered article but still return it (e.g., after ordering several sizes to choose from or when the budget is reached). It is also conceivable that an unsatisfied customer does not make a return but avoids buying from the supplier/manufacturer in the future. The influence of perceived service quality (e.g., delivery time) on the return behavior was not considered in this study.

Fig. 1

Pre- and post-purchase stages with corresponding return prevention starting points

To categorize return prevention measures properly, we refer to measures taken before the purchase decision as “return avoidance”, whereas measures taken after the purchase decision are named “return averting”. In the second case, the aim is no longer to influence expectations towards an article but to negotiate with the customer about the intended return. This negotiation can be done with, for example, money or an appeal. We assume that it is easier to negotiate with a customer who is satisfied with the article than with a dissatisfied customer. In the latter case, the company must also consider whether suppressing the return is beneficial for the customer relationship, or design an offer that avoids lasting customer annoyance. In general, these measures should be applied with caution because once customers have understood this mechanism, they could actively use it to their advantage and change their ordering and purchasing behavior in this direction (Gelbrich et al. 2017).

The return literature dealing with these issues can be segmented into different groups based on their approach (Table 1). While some studies model different scenarios based on researchers’ assumptions (Difrancesco et al. 2018; Dutta et al. 2020; Letizia et al. 2018; Li et al. 2019; Ülkü and Gürler 2018) or on observable online shopping data (Gallino and Moreno 2018; Hjort and Lantz 2016; Lohse et al. 2017; Minnema et al. 2016; Petersen and Kumar 2015; Rao et al. 2018; Sahoo et al. 2018; Samorani et al. 2019; Walsh et al. 2016), we analyze measures of return avoidance and averting by focusing on the customer’s voice, since customers’ evaluation ultimately determines how successfully these measures can be implemented. Thus, we conducted a literature review of recent articles (published between 2015 and 2020) in common scientific databases that include “product return”, “return prevention”, “reverse logistics”, or “return policy”. After screening them by abstracts, we highlight those incorporating customers’ viewpoints derived from survey-based investigations.

Table 1 Recent Studies Investigating Returns from a Consumers’ Viewpoint

It becomes evident that most studies investigating return measures from a customer’s viewpoint explore either the purchase or the post-purchase (returning) stage and thus do not allow a direct comparison of the effectiveness of the measures analyzed. In the same vein, the meta-analytic review by Janakiraman et al. (2016, p. 234) concludes that “[p]rior research has largely examined these effects separately”. Studies interviewing the same respondents on product return prevention measures for both purchase and post-purchase are scant. To the best of our knowledge, we are the first to analyze return prevention measures for all three stages by applying the Kano method.

2.2 Drivers of returns and potential solutions

Whether to buy online instead of in a store also depends on the disadvantages of the mail-order business (Hong and Pavlou 2014; Shulman et al. 2011), which are common knowledge. Consumers who order online have already familiarized themselves with these disadvantages in advance (Ülkü and Gürler 2018) and might even take advantage of vendors’ lenient return policy (Pei and Paswan 2018).

Reasons for product returns are multi-faceted and, in the field of fashion in particular, very individual, and not all product returns can be prevented. Based on a recent investigation with n = 1024 respondents (ibi research 2017), the main drivers of product returns are that the product did not fit (62%), consumers did not like the product (39%), the product was defective or delivered in damaged condition (30%), and the product was not as described (30%). These are followed by multiple variants having been ordered (20%), wrong delivery (7%), delivery that took too long (5%), the product being found cheaper in another shop (2%), and other reasons (2%), which is comparable to prior investigations (Gelbrich et al. 2017; Lee 2015). The drivers identified (Table 2) can be condensed into return reasons related to an information gap and those caused by the online shop operator's service. However, in some cases, customers return articles due to causes related to consumer behavior, such as impulsive purchases (Ülkü and Gürler 2018), so-called “showrooming” behavior (Bell et al. 2018), or not fulfilled returns, which might result in dissatisfaction. Beyond this categorization, cases in which ordered products were intended to be worn as a set but cannot be delivered or combined fall in between consumer-behavior and fulfillment/service reasons.

Table 2 Three main categories for product return reasons

Based on these reasons, we collected potential measures for the three stages (Fig. 1) in Table 3. These are substantiated based on literature and illustrated by practical examples, representing the measures used in our investigation. (As we intend to explore customers’ viewpoint for technological-advanced and state-of-the-art measures, some of the items applied have not yet been investigated in established journals.) Apart from that, we focused on rewarding rather than sanctioning measures. Most online retailers try to avoid the adverse effects of a less lenient return policy, such as ordering elsewhere (Gelbrich et al. 2017). This avoidance is in line with the operant conditioning theory (Skinner 1965), where the intended customer behavior (from a retailer’s perspective) is assumed to occur more frequently when this behavior is linked to a pleasant consequence (“positive reinforcement”). This theory has been applied in many areas of consumer behavior research (Wells 2014), such as online product selections (Perotti et al. 2003), corporate behavior (Vella and Foxall 2013), the effectiveness of TV commercials (Nathan and Wallace 1971), and even in the context of product returns (Gelbrich et al. 2017).

Table 3 Investigated Measures Derived from Practical Applications and Current Literature

Hence, we also incorporate recent measures that have so far only been discussed in blogs and market research reports. Additionally, we assume that the categorization is influenced by the market standard (MS) and the degree of user integration (DoUI).

Unfortunately, there are no relevant publications on the MS or the diffusion of the measures. We have decided to rate the MS on three levels: 1 = very common, 2 = partly common, 3 = very rare/not (yet) existing. For this purpose, we reviewed, among others, the top 20 German fashion online stores in 2018 to be consistent with the customers surveyed, who also live in Germany. In our assessment, only three measures can be considered very common (MS = 1), including the “360° view”, for which we have also counted an all-around photo series. Personalized newsletters were also offered by all providers, although not every newsletter contained a personalized element. We categorized measures as partly common (MS = 2) if they were not shown consistently or only for selected articles in the top 10 providers, which was the case with “catwalk videos” or “information model size”. For measures that were hardly shown (MS = 3), we had to search outside the top 20. In general, it can be said that measures from the post-purchase phase are hardly widespread (and challenging to investigate from an outside position), probably also because mentioning returns after the purchase could encourage a return that is being considered. Bonus points for retained goods are an exception.

For the DoUI, we have also decided on three categories: ○ = no user interaction needed, ◑ = user interaction needed, but can still be used without, ● = can only be accomplished by integrating the user. Many of the measures do not rely on active user participation. We have assigned “No user interaction needed” if no direct interaction is required and the result does not change even with partial user interaction (e.g., “Size advice – figure types”). This category is followed by measures that deliver results even without user input, but for which user interaction leads to improved results (e.g., “Favorite article for comparison”). The highest requirements apply to measures that can only be achieved together with the user. These include virtual try-on or self-measurement.

2.3 The relationship between expectation fulfillment and satisfaction

The effect of the individual (service) attributes on customer satisfaction is not always linear (Kano et al. 1984; Shahin et al. 2017; Shahrestani et al. 2020) and changes over time (Kano 2001). With our work, we would like to provide insight into the expected effects of the different measures from a customer's perspective. For this purpose, various approaches are available (Mikulić and Prebežac 2011). Kano’s model (Kano et al. 1984) is a suitable way to capture effects in the design stage of a product or service and later to derive managerial strategies. Therefore, we will use Kano’s method for our investigations.

In the literature, the terms surrounding Kano’s model are not used precisely. The following shall apply to this work: Kano’s model (Matzler 2003) is the term used to describe the work of Kano (1968, 1987, 1995, 2001) and Kano et al. (1984), which is often referred to as the “Theory of Attractive Quality”. Kano describes that the relationship between expectation fulfillment and customer satisfaction is not always linear. The model serves us as a theoretical concept for the multi-factor structure of customer satisfaction. Kano’s model is to be distinguished from the Kano method, which describes a procedure that can be used for categorization.

According to Kano et al. (1984) and Kano (2001), there are four primary patterns for cause-effect relationships: must-be, one-dimensional, attractive, and indifferent (Fig. 2) supplemented by two relatively rare and theoretical cases (Matzler et al. 1996; Mikulić and Prebežac 2011; Nilsson‐Witell and Fundin 2005) from which strategies for companies are derived.

  • Must-be (M) items are items for which poor performance has the strongest effect on overall customer satisfaction; meeting or even exceeding expectations cannot increase overall customer satisfaction. Strategy: Secure primary performance, e.g., via service level agreements, and make no further investment beyond that.

  • One-dimensional (O) items are items with a direct influence on overall satisfaction for good and bad fulfillment. Strategy: Ensure primary performance and increase it further.

  • Attractive (A) items are usually not expected by the customer and, if present, lead to an improvement in satisfaction. Absence or poor performance does not affect overall satisfaction. Strategy: If the necessary services (M and O) are acceptable, attractive items can be used to differentiate in the market.

  • Indifferent (I) items have neither a positive nor a negative influence on customer satisfaction. Strategy: Avoid investments.

  • Reverse (R) items lead to a decline in satisfaction when present, but their absence leads to an improvement. Strategy: Not only should any investment be avoided, but it should also be considered whether a consciously and externally communicated absence of the item can be perceived as A.

  • Questionable (Q) items arise if none of the five relationships listed above could be determined; consequently, no general strategy applies.

Fig. 2

Kano’s model (Kano et al. 1984) with the illustration of its life cycle (Kano 2001)

Kano (2001) also addresses a dynamic change over time. In his view, a successful quality element of a product or service passes through the following sequence or life cycle: I → A → O → M. Nevertheless, other sequences can also be found. Nilsson‐Witell and Fundin (2005) have shown that when the adoption level is taken into account, the answers can be categorized differently. For example, one service studied was categorized as I during its introduction and later as A, whereas respondents referred to as early adopters already categorized this service as O or even M instead of A. Further studies with time series comparisons have also shown that the attributes change dynamically over time: Hölzing (2008) examined services for people with diabetes at an interval of 6 months (2005, 2006), Raharjo et al. (2010) characteristics of notebooks with ten data points at a 2-month rhythm, Löfgren et al. (2011) quality attributes of commodity packaging (2003, 2009), and Stöcker and Nasseri (2020) touchpoint satisfaction of customers of an e-commerce retailer (2011, 2013).

2.4 Hypothetical framework

We will now derive our hypotheses about the measures presented in Table 3. These can be divided into two main groups: characteristics that concern the measure itself (time effects, type of incentive, and user interaction) and variations in customer attributes (age and order frequency).

2.4.1 Measure-related hypotheses

As illustrated in Fig. 2, new service features will first be evaluated as A, then perceived as O with a linear increase in satisfaction, and finally as M (Kano 2001). However, online shopping operators need to consider the features’ adoption rates over time and take into account the potential competitive advantage of being the first to offer specific measures. According to the law of differentiation dynamics, the prospective competitive advantage will diminish if competitors are already providing such features (Rudolph and Becker 2003). While some measures (those with high levels of MS, see Table 3) are already widely implemented in online shops, others are still in an evolving stage with only a few practical examples existing. Therefore, we assume:

H1

Measures with a low level of MS are more frequently categorized as I and A instead of O and M than those representing a high MS level.

Within the post-purchase stage measures, those related to compensation or rewards might be perceived as positive, as they trigger reinforcement according to the operant conditioning theory (Skinner 1965). Hence, they result in higher consumer satisfaction than sanctioning measures (such as displaying return behavior or return impact information). Measures that reward consumers thus seem to pay off more than measures that sanction them (Gelbrich et al. 2017; IFH Köln and AZ Direct 2016). Although we have excluded re-purchase behavior from this study, it should be evident that, especially in a buyer's market with many suppliers, a negative sanction leads to customer churn. Therefore, such measures must be implemented with great sensitivity. Accordingly, we hypothesize that monetary measures (“Discount on next order”, “Discount on current order”, “Bonus points for purchase”, “Bonus points for non-return orders”, and “Waiver of shipping costs”) will result in a higher increase in customer satisfaction, especially in contrast to measures sanctioning customers (“Display of the return behavior” and “Return impact information”).

H2

Monetary measures have a stronger positive influence on customer satisfaction (CS+) than non-monetary measures.

In the measures described for the avoidance of returns, some can only succeed with the user’s active collaboration (see Table 3, column DoUI). Here, such measures’ success depends on customers’ willingness to engage in these measures (Lai et al. 2014). Since the fashion market is a buyer’s market, we assume that these are less appealing.

H3

Measures that require the direct engagement of users are less frequently categorized as A, O, and M compared to other measures.

2.4.2 Customer-related hypotheses

Technical innovations undergo a life cycle, according to Kano (2001). We assume that newer measures, which cannot be considered the MS, are preferred more by younger than by older customers. In this study, these are the measures “Virtual fitting of articles”, “Self-measurement via webcam”, “Curated shopping”, “Assisted shopping”, and “Online shop as a social platform”. This effect is particularly true for the millennial generation, who possess excellent technological skills (Ladhari et al. 2019).

H4

Innovative measures positively influence customer satisfaction (CS+) among younger customers.

In line with extant literature (Gelbrich et al. 2017), we assume purchase frequency to moderate the categorization of return averting measures. Customers with a high shopping frequency are used to handling product returns as part of shopping online (Ülkü and Gürler 2018). Hence, they readily accept the related consequences and sometimes even take advantage of a merchant’s lenient return policy (Pei and Paswan 2018). Therefore, we expect:

H5

Customers with a high purchase frequency tend to categorize the queried measures in the three purchase stages as A and O.

With the hypotheses that have been formulated, we try to determine structures within the individual measures, which can later be generalized. Using the segmented Kano perspective, we also investigate whether the answers already show signs of a life cycle for the measures. For this purpose, we use a structured questionnaire, which also includes questions on buying and return behavior. Thus, we hope to isolate additional descriptive characteristics that can profile our findings even more precisely.

3 Research design

To shed light on the customers’ voice, we decided to use an online questionnaire sent to all customers. In this questionnaire, we asked one functional and one dysfunctional question for each measure; the answers to both questions were combined in the evaluation.

3.1 Survey and descriptive statistics

While many studies in the return management literature that apply self-report surveys struggle to acquire an adequate sample and use student samples instead (Gelbrich et al. 2017; Oghazi et al. 2018; Pei and Paswan 2018), we want to overcome this issue by surveying actual customers of a leading online shop in Germany. This approach provides multiple advantages. First, in contrast to students, actual customers exhibit higher income levels and, therefore, higher purchasing power (Iyer and Eastman 2006), leading to more realistic responses regarding price issues. Second, even though elderly consumers represent a fast-growing segment in e-commerce, literature on the online shopping behavior of consumers older than 50 years is still very scant (Lian and Yen 2014) and should be examined. Third, while students’ answers to hypothetical scenarios might not reveal their actual shopping and return behavior, we pose our questions within the framework of a specific online shop and address this shop's customers, which results in more realistic findings. For our research, we had the opportunity to contact customers of BAUR Versand (baur.de), a top 10 online retailer for fashion in Germany (EHI Retail Institute 2019). BAUR's product range focuses on fashion, shoes, and home, including furniture, and concentrates primarily on female customers between 40 and 55. BAUR relies primarily on well-known brands, and around 90% of the business volume is handled via the online shop.

The invitation to participate in the survey was sent by e-mail on December 14, 2018, to all BAUR customers, who could answer the questionnaire until January 18, 2019. In the invitation, a raffle of 15 shopping vouchers worth EUR 20 for the BAUR online shop was announced among all participants. To avoid overstraining respondents with a very time-consuming questionnaire, three surveys with different clusters of measures were used and randomly assigned to the e-mail addresses. All questionnaires had the same structure and differed only in the return measures evaluated using the Kano methodology (survey 1: 10 measures, survey 2: 11, and survey 3: 9, see Table 5). At the beginning, the aim and purpose of the study were explained. It was pointed out that this was a joint research project of BAUR and students of a nearby university. Subsequently, the initial questions on the current ordering and return behavior were asked (without further validation via the customer database). On the one hand, this self-assessment serves as an icebreaker; on the other hand, respondents should reflect on their return behavior at this point, which forms the basis for their further answers. These questions were followed by the evaluation of one of the three clusters of measures. The Kano questioning technique, unusual for many respondents, was first introduced using an example. Finally, the questions on socio-demographics were presented, together with space for comments and the opportunity to participate in the raffle. Pretests during the questionnaire development helped to test the comprehensibility of the questions and the structure.

To describe the respondents in more detail in the following analysis, further characteristics were queried: (a) the current ordering behavior, comprising the ordering frequency, the average expenditure on fashion, for whom purchases are mainly made, where individual product ranges are preferentially purchased (online or offline), whether these purchases are mainly spontaneous or planned, and how buying fashion online is generally perceived; (b) the current return behavior, comprising how often a return took place, the reasons for it, how complex a return is perceived to be, and whether the return behavior differs between orders from different shops; and (c) in addition to age and gender, the size of the place of residence, in order to detect any differences due to an assumed imbalance of supply.

A total of 8393 complete questionnaires were evaluated (survey 1: n = 2789, completion rate 68%; survey 2: n = 2855, completion rate 70%; survey 3: n = 2749, completion rate 64%). The three samples are structured as follows with regard to their purchasing behavior and socio-demographic characteristics (for full details, see Appendix).

The majority of customers order fashion online between once a month (30.6%) and once a quarter (32.6%). At the same time, 85.8% of those surveyed stated that they spend up to 150 EUR. Regarding their shopping behavior, 22.8% describe themselves as planning, 36.0% as partly/partially planning, and 41.1% as browsing and discovering. Only very few of the respondents (8.0%) answer that they avoid online shopping when possible. Besides, the vast majority (62.0%) answered that they love buying fashion online. Concerning the number of returns, customers state that they have also returned in 32.6% of (all) orders transacted.

Regarding the reasons for a fashion return, 87.1% of the respondents answered with “Item does not fit”, 45.9% with “I do not like this item”, 41.6% ordered several sizes to choose from, 21.0% answered “not as described”, and 4.2% bought more to choose from at home due to a promotional measure. In the upper third of the scale, 55.4% rate the effort of a fashion return as “not elaborate”. Here too, a bias resulting from surveying active online shoppers is to be assumed. When asked whether the return behavior differs among different providers, 52.4% explicitly answered “no”, while 77.7% of the answers tended toward “no” (first half of the 6-point Likert scale).

Among the respondents, 79.8% are female, 29.1% are between 29 and 44 years old, 32.9% are between 45 and 54 years old, and 38.1% are older than 55. Respondents are slightly overrepresented in small and medium-sized cities (5 to 100 thousand inhabitants) and clearly underrepresented in cities with millions of inhabitants.

3.2 Categorization of the measures

In order to determine the cause-effect relationships for each item in Table 3, two questions were asked: the functional (“imagine that … has [item] …”) (Kano et al. 1984; Matzler et al. 1996; Mikulić and Prebežac 2011) and dysfunctional (“imagine that … has not [item] …”) questions (Berger et al. 1993; Matzler et al. 1996; Nilsson‐Witell and Fundin 2005). The answer is given on an ordinal scale with a middle option [“(1) I like it that way”, “(2) It must be that way”, “(3) I am neutral”, “(4) I can live with it that way”, “(5) I dislike it that way”]. The classification can be determined via the Kano table by combining the two answers to the two questions (Table 4).

Table 4 Kano Table: categories derived from answers to the (Dys-)functional questions (Kano et al. 1984)
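The lookup described above can be sketched in a few lines of Python. Since the contents of Table 4 are not reproduced in this text, the matrix below is an assumption: it follows the evaluation table as it is commonly reproduced in the Kano literature. The names `KANO_TABLE` and `kano_category` are hypothetical and chosen for illustration only; answer codes follow the five-point scale given above.

```python
# Answer codes as in the text:
# 1 = "I like it that way", 2 = "It must be that way", 3 = "I am neutral",
# 4 = "I can live with it that way", 5 = "I dislike it that way".
#
# ASSUMPTION: this matrix mirrors the commonly reproduced Kano evaluation
# table; it is not copied from Table 4 of this study.
KANO_TABLE = [
    # dysfunctional answer: 1    2    3    4    5
    ["Q", "A", "A", "A", "O"],  # functional answer 1
    ["R", "I", "I", "I", "M"],  # functional answer 2
    ["R", "I", "I", "I", "M"],  # functional answer 3
    ["R", "I", "I", "I", "M"],  # functional answer 4
    ["R", "R", "R", "R", "Q"],  # functional answer 5
]


def kano_category(functional: int, dysfunctional: int) -> str:
    """Return the Kano category for one respondent's pair of answers."""
    if not (1 <= functional <= 5 and 1 <= dysfunctional <= 5):
        raise ValueError("answers must be coded from 1 to 5")
    return KANO_TABLE[functional - 1][dysfunctional - 1]


# A respondent who likes having the measure and dislikes its absence:
print(kano_category(1, 5))  # O (one-dimensional)
```

Keeping the table as a plain nested list makes it easy to compare against the printed Kano table cell by cell.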

The characteristic is now derived from the Kano table. If all survey results of one question are plotted as value pairs in a coordinate system, the characteristic Kano curves are obtained (see Fig. 2).

In the literature, however, another approach is also common. In this case, no curves are shown; instead, the character of a measure is reflected in its position in one of four quadrants. This approach presents the positive and negative impact on customer satisfaction as two coefficients (Berger et al. 1993; Shahin et al. 2013; Shahin and Zairi 2009): a positive coefficient (CS+) based on answers falling into classes A and O, and a negative coefficient (CS−) based on answers falling into classes O and M. Answer combinations from classes Q and R are not considered. The results are then displayed graphically in a coordinate system with CS+ and CS− as orthogonal axes. The two coefficients indicate how often an item has been categorized into the respective groups. For CS+, the mentions counted are those in which fulfilling the expectation positively influences satisfaction (A and O). For CS−, they are those in which failing to fulfill the expectation negatively influences satisfaction (O and M). A high absolute value consequently indicates a strong association with customer satisfaction. Whereas Kano’s categorization can only be interpreted by translating the terms into one of the corresponding curves, the coefficients dispense with this step: the positive as well as the negative effect can be read off directly.

$$CS^{ + } = \frac{\# A + \# O}{{\# A + \# O + \# M + \# I}}$$
$$CS^{ - } = - \frac{\# O + \# M}{{\# A + \# O + \# M + \# I}}.$$

#A, #I, #M, and #O represent the response frequencies of the categories, i.e., the number of responses categorized as A, I, M, or O. CS+ lies between 0 and 1 and CS− between −1 and 0; both reflect the impact on satisfaction. The categorization of an item is again apparent from the location of its point. The coordinate system is divided into four quadrants:

$${\text{A}},\;\;{\text{if}}\;\left\{ {\begin{array}{*{20}l} {0.5 \le {\text{CS}}^{ + } \le 1\quad {\text{and}}} \hfill \\ {0 \ge {\text{CS}}^{ - } > - 0.5} \hfill \\ \end{array} } \right.$$
$${\text{I}},\;\;{\text{if}}\;\left\{ {\begin{array}{*{20}l} {0 \le {\text{CS}}^{ + } < 0.5\quad {\text{and}}} \hfill \\ {0 \ge {\text{CS}}^{ - } > - 0.5} \hfill \\ \end{array} } \right.$$
$${\text{M}},\;\;{\text{if}}\;\left\{ {\begin{array}{*{20}l} {0 \le {\text{CS}}^{ + } < 0.5\quad {\text{and}}} \hfill \\ { - 0.5 \ge {\text{CS}}^{ - } \ge - 1} \hfill \\ \end{array} } \right.$$
$${\text{O}},\;\;{\text{if}}\;\left\{ {\begin{array}{*{20}l} {0.5 \le {\text{CS}}^{ + } \le 1\quad {\text{and}}} \hfill \\ { - 0.5 \ge {\text{CS}}^{ - } \ge - 1.} \hfill \\ \end{array} } \right.$$
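The two coefficients and the quadrant rule can be sketched in a few lines. The function names and the category counts in the usage example are hypothetical and chosen only for illustration; they are not values from the survey.

```python
def cs_coefficients(a: int, o: int, m: int, i: int) -> tuple:
    """Return (CS+, CS-) from the category frequencies #A, #O, #M and #I.

    Answers categorized as Q or R are deliberately left out, as in the
    formulas above.
    """
    total = a + o + m + i
    if total == 0:
        raise ValueError("at least one A, O, M or I answer is required")
    cs_plus = (a + o) / total
    cs_minus = -(o + m) / total
    return cs_plus, cs_minus


def quadrant(cs_plus: float, cs_minus: float) -> str:
    """Map a (CS+, CS-) pair to the quadrant label A, I, M or O."""
    strong_positive = cs_plus >= 0.5    # satisfies when fulfilled
    strong_negative = cs_minus <= -0.5  # dissatisfies when not fulfilled
    if strong_positive and strong_negative:
        return "O"
    if strong_positive:
        return "A"
    if strong_negative:
        return "M"
    return "I"


# Hypothetical counts: 60 x A, 20 x O, 5 x M, 15 x I
cs_plus, cs_minus = cs_coefficients(a=60, o=20, m=5, i=15)
print(cs_plus, cs_minus, quadrant(cs_plus, cs_minus))
```

With these hypothetical counts, 80 of 100 answers fall into A or O and 25 of 100 into O or M, so the item lies in quadrant A.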

If points are close to the origin, no influence can be proven at all. If a point lies precisely in the middle, a positive and, at the same time, a negative influence is detectable for 50% of the respondents. The position in the coordinate system can be determined using the formulas described in Chapter 3 or read directly from Table 5. For example, the item “Discount on next order” has a CS+ of 0.8 and a CS− of −0.23. It can therefore be found in quadrant “A” in the upper left corner.

Table 5 Overall assessment of possible measures

In addition, the total strength (TS) represents the share of mentions categorized as A, M, or O among all mentions. Items with a high TS have a strong influence (positive or negative) on total customer satisfaction. The TS serves to prioritize the individual items with respect to their effect on customer satisfaction. Improvements to items with a high TS should have a high impact on the change in customer satisfaction:

$${\text{Total}}\;{\text{strength}} = \frac{{\# {\text{A}} + \# {\text{M}} + \# {\text{O}}}}{{\# {\text{A}} + \# {\text{O}} + \# {\text{M}} + \# {\text{I}} + \# {\text{Q}} + \# {\text{R}}}}.$$
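As a brief illustration, the TS can be computed from a dictionary of category counts. Unlike the CS coefficients, the denominator here also includes the Q and R answers. The function name and the counts are hypothetical.

```python
def total_strength(counts: dict) -> float:
    """Share of A, M and O answers among all answers (A, O, M, I, Q, R)."""
    total = sum(counts.get(category, 0) for category in "AOMIQR")
    if total == 0:
        raise ValueError("no answers given")
    return sum(counts.get(category, 0) for category in "AMO") / total


# Hypothetical counts for one measure (100 answers in total)
example = {"A": 50, "O": 20, "M": 10, "I": 15, "Q": 2, "R": 3}
print(total_strength(example))
```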

Recently, other papers have applied an additional variant of the method described above. The segmented Kano perspective goes one level deeper by searching for clusters within the answers. This approach makes it possible to identify customer segments with different expectations that are not visible in the aggregated form. For this purpose, the answers to the functional and dysfunctional questions enter the cluster analysis as metric features (Baier, Rese and Röglinger 2018), or a one-mode non-metric cluster analysis is applied to the derived categories (Rese et al. 2019). The number of clusters is then determined iteratively by observing the Bayesian Information Criterion (BIC), which is based on the likelihood function.

4 Findings

Table 5 displays the overall assessment of the measures based on Kano’s model, indicating category frequencies, the total strength (TS), the customer satisfaction index CS+, and the customer dissatisfaction index CS−.

The surveyed measures are evaluated solely as A (not expected, but with a positive influence on overall satisfaction when present) or I (no evident influence on overall satisfaction). Regarding category A, the measures “360° view”, “Discount on current order”, “Discount on next order”, “Bonus points for purchases”, and “Waiver of shipping costs” stand out. More than 50% of the respondents rated these measures as A, suggesting that they could substantially contribute to customer satisfaction. In contrast, the measures “Curated shopping”, “Assisted shopping”, “Commenting on reviews”, “Online shop as a social platform”, “Photos from social networks”, “Outfit recommendations from influencers” and “Return impact information” are categorized as I in more than 50% of the mentions. Here, no influence on customer satisfaction is expected when implementing the measures. None of the measures can be described as M. The only measure that could be considered O is “Waiver of shipping costs”, for which the two most frequent categories are A with 1373 mentions and O with 900 mentions. Measures are categorized as R if their relationship to satisfaction is precisely reversed: a good implementation has a negative effect on satisfaction and a poor one a positive effect. Here, the measure webcam size is particularly striking: it is categorized as I with 1125 mentions but, with 996 mentions, is also very close to R.

If we relate the categorization of the measures to the degree of MS that we have assumed (H1), no consistent picture emerges (Table 6). The measures investigated are distributed equally between I and A, regardless of the MS’s level. Interestingly, even measures that have been established in the market for a long time are only categorized as A.

Table 6 Categorization of measures concerning the assumed MS

Nor can a uniform picture be formed for the DoUI (H3, Table 7). Even if we additionally exclude return averting measures, which do not prevent returns in the narrower sense but negotiate the conditions under which the customer would refrain from returning, a similar distribution between I and A can be observed. Consequently, this hypothesis must also be rejected.

Table 7 Categorization of measures concerning the assumed DoUI

Figure 3 shows all measures based on their impact coefficients CS+ and CS−. Here, too, the same picture emerges. All measures presented are located in the two quadrants I and A. Furthermore, an interesting pattern becomes visible: most measures with a monetary reward show the largest positive impact (“Waiver of shipping costs”, “Discount on next order”, “Discount on current order”, “Bonus points for purchases”). As hypothesized in H2, monetary measures have a stronger positive influence on customer satisfaction than the other examined measures and measures sanctioning the customer. These are followed by measures that primarily result in an improvement of the presentation by the vendor (“360° view”, “Find out individual size”, “Size advice—figure types”, “Size recommendation—previous purchases”, “Presentation via video”, “One model wears all sizes”). A third block can be seen in the I quadrant. This cluster contains measures that either include external content in the shop (“Online shop as a social platform”, “Outfit recommendations from influencers”, “Commenting on reviews”, “Photos from social networks”), require the customer to be involved (“Assisted shopping”, “Self-measurement via webcam”, “Photos from social networks”) or reflect their return behavior. Since our study examines all stages of the purchasing process, we can clearly show this hierarchy of measures at this point.

Fig. 3 Depiction of the overall assessment of possible measures (n = 8396)

We then examined all proposed measures for dependencies (linear or segmental) between the answering behavior and the respondents’ age (H4) and shopping frequency (H5); no significant differences were found.

We apply the aforementioned segmented Kano perspective to gain more meaningful insights from the overall results and to derive clearer implications. We have used the well-known two-step clustering approach according to Chiu et al. (2001). In each record, each measure is categorized according to Kano’s evaluation table. For the resulting nominal data matrix, an independent multinomial distribution of the categories is assumed for each attribute within the clusters. The optimal number of clusters is then determined iteratively, taking into account the BIC. In this case, three clusters have proven to be ideal.
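The model-selection step can be illustrated with a short sketch. It assumes the common definition BIC = −2 ln L + p ln n and counts the free parameters of a mixture with k clusters in which each of J measures follows an independent multinomial distribution over C categories. All function names are hypothetical, and the log-likelihood values in the usage example are invented for illustration; they are not results from this study, and the sketch does not reproduce the two-step procedure itself.

```python
import math


def n_free_parameters(k: int, n_items: int, n_categories: int) -> int:
    """Free parameters of a k-cluster mixture of independent multinomials:
    (k - 1) mixing weights plus k * J * (C - 1) category probabilities."""
    return (k - 1) + k * n_items * (n_categories - 1)


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion; lower values indicate a better model."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def best_k(log_likelihoods: dict, n_items: int, n_categories: int, n_obs: int):
    """Return the cluster count with the lowest BIC and all BIC scores."""
    scores = {
        k: bic(ll, n_free_parameters(k, n_items, n_categories), n_obs)
        for k, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get), scores


# Invented log-likelihoods of fitted models with 1 to 4 clusters
fitted = {1: -16000.0, 2: -15500.0, 3: -15200.0, 4: -15150.0}
k, scores = best_k(fitted, n_items=10, n_categories=6, n_obs=1000)
print(k)
```

In this invented example, the gain in log-likelihood from three to four clusters no longer outweighs the penalty for the additional parameters, so three clusters yield the lowest BIC.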

From the results in Table 8 and Figs. 4, 5 and 6, initial findings can already be deduced. A closer look reveals that the segments of the three surveys follow similar patterns: each segment can be assigned to a quadrant. We named segments located primarily in I “Indifferents”, those in A “Enthusiastics” and those in O “Demanders”. A segment in M we would call “Taken-for-granteds”. Nilsson‐Witell and Fundin (2005) found a similar starting position in their study of “e-service”. When introduced, the service was perceived as I; later it became A. They related the mentions classified as A to a technology adoption level and found segments in O and M, which they refer to as “early adopters”, a term also used in the diffusion of innovations theory (Rogers 1962).

Table 8 Segment-specific category frequencies for each of the three surveys
Fig. 4 Depiction of the assessment of possible measures survey 1 (n = 2792)

Fig. 5 Depiction of the assessment of possible measures survey 2 (n = 2855)

Fig. 6 Depiction of the assessment of possible measures survey 3 (n = 2749)

These segments can again be depicted graphically, where each graph represents one of the three surveys. In the graphs, there are three data points (segments) for each measure. Different symbols indicate the affiliation to the respective segment. For the sake of clarity, we have refrained from displaying all 84 data points in one graph. Therefore, we have staggered the graphs according to the surveys. We see more information value in directly comparing the clusters’ positions to each other for each item. Also, the clusters were calculated independently for each survey.

Analogous to Kano’s life cycle theory, there are already segments with statistically significant differences. In the segmented Kano perspective, it is not the measures that differ among themselves, but the persons confronted with the measures. Figs. 4, 5 and 6 show that a general MS apparently has no influence or cannot be determined in the first place. Instead, referring again to H1, it seems that customers have individual expectations regarding the measures.

Naturally, measures that show the strongest and weakest influences in aggregate (Table 5) also show the strongest or weakest influences in relative terms for the individual segments. To have a strong influence overall, many respondents have to answer in the same way, which is also the case after dividing them into the three segments. Therefore, measures with a very high or very low impact on customer satisfaction are found in similar (relative) positions after splitting into segments.

To provide a more detailed characterization of the segments, we also investigated them regarding their buying and return behavior and their socio-demographics. No significant differences were found here either.

5 Discussion

5.1 Theoretical contribution

While the majority of previous studies analyzed return behavior either before/during, or after the purchase decision (Janakiraman et al. 2016), we contribute to the literature by expanding the view on return management to a 3-stage approach, which covers the pre-purchase, purchase, and post-purchase stages and is based on a large data pool of actual customers (n = 8396). This holistic approach reveals that return measures in the post-purchase and actual purchase stage are better suited to increasing consumers’ satisfaction than those related to the pre-purchase stage. For this purpose, we have extended the already existing approaches in Fig. 1. Besides, to the best of our knowledge, this study is the first to analyze consumer return behavior by applying the Kano method. Hence, we provide an overview of the return avoidance and averting measures that satisfy consumers the most. This juxtaposition shows for the first time how strongly monetary approaches differ from the remaining measures. Without a combination of the three stages, this finding would not have been possible.

5.2 Managerial implications

Returns in the mail-order business, especially fashion, are a great nuisance for the customer, the company, and the environment. However, not all of the proposed measures can be effectively implemented by a company. It is, therefore, essential to focus on a few but effective measures. Our paper offers new insights in this respect. As one might expect, measures that reward customers prevail. Since these, in turn, actively influence pricing policy, such measures must be weighed up carefully. The next interesting group includes measures that aim at improving the presentation of merchandise without requiring further effort from the customer. The “360° view” stands out in particular. This measure has even prevailed over more elaborate presentations such as “Model type photos”, “Presentation via catwalk videos”, “Virtual fitting of articles”, “Presentation via video”, or “Information model size”.

The second important finding is that the measures were categorized exclusively as I or A on the overall level. A lack of or poor performance in these measures still has little effect on satisfaction. Two conclusions can be drawn from this. First, returns may hardly be an issue for the respondents: more than half of them answered that they do not consider returns to be costly (a possible bias being that only active mail-order customers participated in the survey). Second, the real MS in terms of avoiding and averting returns may still be low, and so are the expectations.

However, a more precise segmentation into clusters already reveals the first one-directional measures. Thus, there are already customers whose expectations are significantly higher and for whom the absence or poor performance of a measure leads to dissatisfaction. In the sense of early strategic detection, this customer group should be observed more closely. If this group grows significantly over time, investment in return management is no longer just nice to have but essential for customer satisfaction. Ultimately, it can be assumed that measures will migrate to the M quadrant over time, which means that investments in this area will no longer increase satisfaction but will only prevent dissatisfaction. Unfortunately, we were not able to describe these clusters more precisely with the customer characteristics queried. Further research in this area would, therefore, be highly desirable.

In brief, this means that customer expectations in return management are generally still in a very early life cycle stage. However, this by no means indicates that the topic is unimportant. On the contrary, vendors can set themselves ahead of the competition and gain a competitive advantage by, for example, improving the presentation of their products. Our categorization of the MS for Germany clearly shows how few measures can already be considered established. The online market, which is still growing dynamically, will also be joined by different groups of consumers who can no longer be described as early movers. Here the demands will change even more significantly, also concerning return management. The current social discussion in Western countries is also bringing the environmental impact of human activity more into focus. Here, too, vendors can already differentiate themselves from the competition today and use the first-mover advantage for themselves.

Monetary incentives such as vouchers or discounts promise a high impact, and these mechanisms are usually easy to comprehend. However, we are firmly convinced that a focus on monetary incentives alone does not represent a differentiating feature, since it can easily be copied by the competition.

Finally, it should be noted that some measures, albeit unintentionally, can have a negative impact on repeat purchase behavior. A test-and-learn approach is, therefore, advisable here.

5.3 Limitations and future research

This paper is limited in some respects. First of all, the respondents are active mail-order customers acquired via newsletters. Potential customers who, for example, do not buy by mail order at all due to the problem of returns are not present. Neither the age nor the gender structure is representative of Germany. It is also conceivable that BAUR customers differ from other mail-order customers in their attitude to, among other things, new technologies, particularly because mainly women over 45 years of age answered the survey (see Appendix). They differ significantly from the millennial generation regarding their technological skills (Ladhari et al. 2019).

Secondly, the survey was conducted in the German market. Thus, no assertions about possible cultural influences are possible, nor can the industry structure be transferred to other markets without adjustments. Competition may be more intense or extensive, which also affects expectations. For instance, based on the cultural dimensions (Hofstede 1980), Germans are assumed to be more likely to avoid uncertain outcomes (rating: 65), compared to, e.g., Americans (rating: 46) or Chinese (rating: 30; Hofstede Insights 2020). Hence, return avoidance and return averting measures can be expected to be of higher interest among Germans to avoid such potentially wrong decisions.

The formed segments strongly indicate a dynamic over time, but unfortunately, could not be described in more detail using the other characteristics that were queried. Therefore, no further contribution could be made here about the presumed adoption behavior, leaving space for further investigations.

There is also strong evidence in the literature that a very restrictive or inconvenient return policy can also affect purchase and re-purchase behavior, especially in a competitive environment like fashion retail. In our view, this field also still receives little attention in research.

Finally, Kano’s method has its limits. Especially in innovation research, many measures are categorized as I or A. The method can only indicate the current status without providing direct trends for individual attributes’ future progression. A life cycle is only determined retrospectively. Especially in very dynamic markets such as (fashion) e-commerce, new features are often simply trialed without the need to go through a classic life cycle. Therefore, the context of the featured solutions must always be considered. Also, the special form of questioning is quite time-consuming and requires a high level of concentration from respondents, which limits the feasible length of surveys.

6 Conclusion

We wanted to investigate the most effective strategies to counteract returns from the customers’ standpoint. Based on a newly developed three-stage purchase process model, several measures were investigated with regard to their potential impact on customer satisfaction. Using the Kano method and its subsequent segmented Kano perspective, interesting results were obtained. Among other things, we were able to show that an improvement in the presentation of the products on offer is generally an excellent choice for counteracting returns and that different expectations regarding return management can already be observed today. We thus confirm prior findings, revealing that enhanced product presentation features, such as zooming (De et al. 2013), or in our case a 360° perspective, pave the way for fewer returns or higher customer satisfaction, respectively.

Similarly, photos from social networks or, more generally, alternative product photos are perceived indifferently or might even result in more returns (De et al. 2013). Moreover, we validated that offering virtual fit information reduces returns (Gallino and Moreno 2018), as virtual reality tools lead to increased customer satisfaction. Generally, our insights emphasize that monetary gratifications are the measures that increase customer satisfaction the most, which contradicts earlier findings derived from online shop return rates below the usual average in the fashion industry (Walsh and Möhring 2015). Beyond the nature of the gratification and the contrast to previous literature, our holistic perspective demonstrated that measures from the post-purchase stage are most likely to increase customer satisfaction, as five of them are among the eight most practical measures (highest CS+). With this work, we hope to have provided valuable insights into the avoidance and prevention of returns, leading to a reduction of returns in practice.