What you pay is what you get?

Complementing the current paradigm change from QoS to QoE, we address fundamental QoE charging issues for Internet services from an end user perspective. Here, key issues arise from gaps of different information contexts involved, which have to be managed when introducing a QoE product. Hence, this paper analyzes the double role of prices for quality perception as well as the impact of QoE on user demand with the help of a fixed point model. Our model is consistent with real-world user behavior that we have observed during comprehensive user trials on quality perception for video on demand services. Based on these results, we propose a simple approach for convergence-based user classification, and discuss the complementarity of willingness-to-pay vs. subjective quality perception in service purchasing situations.


Introduction and related work
For several decades, service quality in communications networks has been described more or less solely in terms of QoS (Quality of Service) parameters, like packet loss rate, delay, jitter, bandwidth etc. In the last few years, however, we observe a clear trend in academia and industry towards more user-centric concepts of service quality, eventually leading to a veritable paradigm change [24]. Especially the notion of QoE (Quality of Experience) [12] has rapidly gained in importance as a way to capture the "overall acceptability of an application or service, as perceived subjectively by the end-user" [9]. In a much broader sense, more recently QoE has been defined as the "degree of delight or annoyance of the user of an application or service", resulting from "the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user's personality and current state" [14]. Along these definitions, a comprehensive body of related work has developed QoE framework models as well as corresponding metrics and measurement methodologies [17,13,19,26].
At the same time, the question of how to charge end users for their perceived quality has been largely neglected, even if appropriate charging is widely recognized as an indispensable prerequisite for putting quality differentiation into practice [10] and, in the case of QoS, has led to a plethora of related work (see for instance [3,30,31] for surveys and [2] for an introductory textbook). Hence we argue that, complementing the mentioned paradigm change, research on QoE-based charging deserves much more attention, while, apart from some early contributions [1,5], research has insufficiently addressed this user-centric perspective up to now. This is especially problematic as QoE and utilities from QoE (for instance related to revenues or valuations of a service) are disparate concepts as pointed out in [34]. Of course, we are well aware of the fact that the effect of such pricing differentiation cannot reach the end user directly as long as operators stay with their current general attitude of turning their back on negotiating service level agreements (SLA) directly with end users. Nevertheless, understanding what the user really wants (and at which price) will be crucial both for creating future services and for successfully bringing them to the market.
Therefore, in this paper we aim at discussing several fundamental issues concerning charging for QoE, in order to lay the foundations for future research in this field. Significantly extending our earlier work [21,27,23], we propose and analyze a general model for QoE-based charging together with its empirical validation, which is based on the results of several comprehensive user trials described in [18,28,35]. As specific novel contributions, our model encompasses the joint effects of pricing on user perception and the impact of QoE on user demand, and allows deriving corresponding fixed points which characterize the equilibrium between QoS, QoE, price and demand for various types of users. The results of our user trials illuminate the convergence behavior of end users, who ponder over time the tradeoff between desired service quality and associated price, as well as further economic and psychological aspects of QoE-based charging. Such tradeoffs mirror a conflict of interest for users around antagonistic quality and price preferences (as, rationally configured, high quality often comes with a high price and the other way round). To mitigate marketization challenges arising from this conflict, more sophisticated means for conveying such QoE products to the consumer will be needed in the future.
The remainder of this paper is structured as follows: section "Background: context mismatches and information gaps" provides a general analysis of the different information contexts involved in our problem and identifies the gaps between them. Based on this, section "Fixed-point models for QoS- and QoE-based charging" introduces several flavors of a fixed-point model for QoS- and QoE-based charging. Section "End user convergence behaviour" describes the setup of our user trials and presents some key results, which also allow deriving an efficient approach for user classification. In section "Willingness to pay and user interaction behavior", we direct our interest towards end user willingness-to-pay (WTP), before section "Summary and conclusions" concludes the paper with a brief summary and an outlook on current and future work.

Background: context mismatches and information gaps
From a provider perspective, the transition from QoS to QoE mainly aims at better understanding the customer needs in order to improve the customer experience and/or more efficiently use the available resources (i.e., loosely speaking, cost reduction through QoS provisioning, but only as long as customers will not be able to subjectively detect it). So far, however, the QoE concept has not been able to live up to these high promises, one of the reasons being that the economic utilization of QoE data obtained outside of concrete purchasing situations has remained obfuscated. Moreover, it is difficult to imagine that the price for a service will depend directly on the individual level of the experienced quality at the end customer side. In this sense, future "QoE products" will be restricted to offering different QoE levels for different prices, in an attempt to increase user delight. However, also in this restricted perspective it is essential to analyze the systematic influence of price (and price perception) on the quality perception of the end user (and vice versa), and to deal with the question of how to derive utilities from empirical QoE data, for instance using willingness-to-pay (WTP) as an ISP utility metric for a specific service, customer segment and context.
In order to illustrate this in more detail, we follow the top-down approach proposed in [33] and start with describing the big picture on information context(s) in the field of QoE, before distinguishing three corresponding information gaps that hamper the commercialization of QoE.
As depicted in Fig. 1, information (such as QoS or QoE data) is critical for the commercialization of network quality but inherently context-specific. Here, QoS is placed in a strictly technical context without involving any human role. The missing human context is introduced with the creation of the QoE concept, within which a user role exists that maps technical and objective quality to a subjective quality appreciation (or "delight" according to [14]). The resulting mismatch of roles can only be resolved through empirical testing and thus has triggered numerous lab and field trials conducted in QoE research.
Hence, let us consider an abstract version of a QoE product which is offered in the context of a market with dedicated size and given characteristics (socio-economic background of population, overall size, etc.). Essentially, this QoE product refers to monetization of knowledge about quality experiences, especially by telcos and firms offering networked services (such as content providers). While direct per-service, per-user and per-usage monetization of QoE information is procedurally and legally challenging as well as highly contextual, indirect utilization of aggregate QoE information can serve the product design with much less effort. For example, QoE products may allow for QoE-aware capacity dimensioning by telcos in order to efficiently and effectively occupy attractive market segments, or content providers may design their services for customer groups and their expected most common aggregate usage scenarios. Of course, indirectly QoE-aware products can only partially realize the quality and price differentiation potential of network services. As a basic characteristic of this QoE product, we note that it is formed around the human roles of customers, users, and suppliers, which gives rise to three different types of information gaps (depicted as IG1-IG3 in Fig. 1).
Firstly (IG1), we observe a role context mismatch between QoE assessment (= user role, quality perception metric) and commercialization (= customer role, revenue metric). Outside a purchasing situation, users may appreciate the provided quality level (i.e., high QoE), but would not have an interest in spending any money for this service or service quality (i.e., low revenue). In order to bridge this information gap, QoE could be assessed during purchasing situations where pricing $p$ is involved. Hence, the pure quality rating (QoE) will shift to a price-affected quality rating ($\mathrm{QoE}_p$) where subjects can express their acceptance for a given quality level under a given price condition. Such a metric can provide a solid indication that a purchase will be replicated under similar conditions. A more explicit revenue metric is given with WTP, where the maximum revenue is assessed, i.e., referring to the classical customer role.
Secondly (IG2), the provider needs to optimize the QoE product to meet market objectives, typically in terms of limiting costs and maximizing revenues. While costs primarily refer to the capacity investments in order to meet QoS requirements for the QoE product, the revenue is bound to the socio-economic factors of the market, the service expectation of the customers, and the customers' willingness to pay for the product. This optimization is non-trivial, as communicating with and/or conveying the QoE product to potential customers is critical and technically difficult. Moreover, as the QoE product can be classified as an experience good [20], it requires marketing and business strategy efforts especially during the market entrance phase [32].
Thirdly (IG3), QoE trials, bridging the information gap between technical parameters and experience, are inherently bound to their environmental context (scenario, tools, light, noise, stress, etc.), test scenario (e.g., video conference), and specific test conditions (e.g., parameters, bandwidth ranges or other QoS bounds). For instance, a noisy or stressful environment may alter the obtained QoE results substantially. In this sense, QoE is a local metric describing the subjective quality perception, and generalization may only be possible under certain, maybe quite limited circumstances, when sufficient empirical data is collected and no other information gap prevents it. Local normalization of QoE tests, e.g., through QoE training sessions as suggested by ITU-T Rec. P.10 [9], provides a standard approach to avoid this problem, but leads directly to a third type of context mismatch where global validity of QoE results is lost: measurement practices of QoE are local by definition and do not directly represent a global value, while the controllability of empirical trials (in the field as well as in the lab) requires a certain moderation of test cases and rating behaviours. Hence, while the common practice to use training sessions for reducing the data noise may appear highly reasonable for reasons of testability, QoE is reduced to a perceptive sensing metric where the commercial utilization of results remains challenging.
Let us illustrate this last point with an example, and consider a QoE trial which afterwards is replicated with a different parameter range. For instance, suppose we replicate the HD video QoE trial of [11] or alternatively the trials in [28,29,35], e.g., in a range from 1 kbit/s to 25 Mbit/s, and shift the tested video bitrate by 7 Mbit/s (i.e., $x' = x + 7$ Mbit/s, where $x$ is the initially tested bitrate, $x'$ is the altered bitrate for the replicated test, and 7 Mbit/s is the bitrate offset that is exemplarily chosen). In the original and the bitrate-shifted case, we may assume that the underlying laws and mechanisms remain identical [23], hence a similarly shaped QoS-to-QoE curve between the considered minimum and maximum quality should result, e.g., a logarithmic or exponential shape (cf. Fig. 2 based on the data of [4] and [11], in this case reconditioned logarithmically as recommended in [23]). Depending on the service type and scenario, a more or less sensitive reaction of QoE ratings can be expected upon QoS changes, i.e., the QoS-to-QoE mapping may vary in its steepness. However, when comparing the real and the fictively replicated trial with shifted bitrates, different QoE values will be obtained for identical QoS input stimuli during the trial. Thus, the obtained QoS-QoE relations will not match across trials anymore, and we can expect that the maximum QoS of the initial trial will result in substantially higher average QoE ratings than in the rerun. Hence, the QoE results must be considered to be local to a very specifically defined context, and cannot easily be transferred to a figure that can be used for bringing enhanced network quality solutions to the market.
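The effect of shifting the test range can be sketched numerically. The following Python fragment is a minimal illustration only: it assumes a purely logarithmic QoS-to-QoE mapping that is normalized locally, so that the lowest tested bitrate is rated 1 and the highest 5 on a five-point scale. The function `local_mos` and this normalization rule are assumptions of the sketch and are not fitted to the trial data of [11] or [28,29,35].

```python
import math

def local_mos(bitrate_mbps, lo_mbps, hi_mbps):
    """Logarithmic QoS-to-QoE mapping, normalized to the tested range.

    Assumption: the lowest tested bitrate is rated 1 and the highest 5
    on a five-point scale; this mimics local normalization and is not
    fitted to any trial data.
    """
    span = math.log(hi_mbps) - math.log(lo_mbps)
    return 1.0 + 4.0 * (math.log(bitrate_mbps) - math.log(lo_mbps)) / span

OFFSET = 7.0              # bitrate shift from the example, in Mbit/s
ORIGINAL = (0.001, 25.0)  # 1 kbit/s to 25 Mbit/s
SHIFTED = (ORIGINAL[0] + OFFSET, ORIGINAL[1] + OFFSET)

probe = 25.0  # a bitrate that lies inside both test ranges
mos_original = local_mos(probe, *ORIGINAL)
mos_shifted = local_mos(probe, *SHIFTED)
print(f"{probe} Mbit/s: original {mos_original:.2f}, shifted {mos_shifted:.2f}")
```

Under these assumptions, the same 25 Mbit/s stimulus is rated at the top of the scale in the original trial but below it in the shifted trial, which is the mismatch across trials described above.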
A similar situation may arise when a different variant of a service is tested, e.g., when transitioning from high definition (HD) to standard definition (SD) videos. While the controllability of test routines aims at improving the interpretability of the data, it counteracts this effort by severely limiting the information gain from such empirical QoE trials for the commercialization of network quality.
On the other hand, monetary metrics (such as money used for market or other financial figures) are non-local and linear, and apart from specific contexts have a global and universal validity and value. Hence, the context-specific service experience needs to be mapped to a universal metric, which creates a knowledge gap as denormalization functions for QoE results are not yet available in general.
Altogether, these three information gaps affect the understanding and marketing of QoE products: any role context produces highly contextual information, which has an immediate consequence for QoE and WTP (or other utility-related data) that obviously are in different role contexts, and the mapping of corresponding data becomes a non-trivial task (and may require extensive user trials as a direct technique for bridging the different contexts). For instance, a clear mismatch between the appreciation for high QoE and the willingness to spend any money for this kind of service exists. Furthermore, the cognitive dissonance assessment in [28] hints towards the possibility of a direct influence of pricing on the subjectively perceived network quality, indicating a non-trivial (non-linear) mismatch between theoretical QoE tests without price considerations and QoE assessments during purchase situations. For these reasons, a better understanding of pure service valuation or preference is required for projecting QoE results to monetary means.
In the rest of this paper, we propose, analyze and discuss several approaches for resolving these information gaps. We will focus on subjects who provide QoE ratings during a purchase situation while at the same time assessing their quality appreciation, which allows eliminating the gap between different roles, as in this case both user and customer roles are active during the assessment. The mismatch between the resulting $\mathrm{QoE}_p$ and QoE is characterized by an interesting fixed-point problem which we address in detail in "Fixed-point models for QoS- and QoE-based charging", where pricing as such creates a negative feedback that has to be incorporated into the quality experience metric. On the other hand, WTP has similar characteristics as $\mathrm{QoE}_p$, but provides directly usable data for the ISP (e.g., ISP revenue data can be derived); hence, assessment methods and first results for $\mathrm{QoE}_p$ and WTP are presented and compared in section "Willingness to pay and user interaction behavior". Furthermore, our work is based on comprehensive empirical testing across services, scenarios and test ranges, which altogether allows a generalization of the results, without, however, automatically getting rid of the problems associated with local normalization. This is only possible if we assume a global human perspective during the purchase situation. Here, due to the user-customer role inclusion, formalizing a direct relationship to WTP appears realistic. Thus, based on the extensive experience with highly controlled QoE trials with locally normalized output data, we are confident that the research community has by now developed the tools and obtained the experience to target $\mathrm{QoE}_p$ and similar output metrics.

Fixed-point models for QoS- and QoE-based charging
In order to illustrate the basic conceptual difference between traditional charging based on QoS only and QoE-based charging, in [27] we have considered a simple dynamic system where a provider with limited resources offers a service to end customers, see Fig. 3a. While QoS provisioning and pricing is up to the provider, the users are able to decide on their demand, which of course will depend on the price (e.g., according to the classical concept of "price elasticity" [16]). On the other hand, because of resource limitations in the provider's network, the size of overall demand will influence service quality, according to some (more or less sophisticated) mechanism matching demand and supply. Finally, we assume that the QoS level delivered by the operator is reflected in a corresponding tariff structure, and thus in the price eventually charged to the customer (cf. Fig. 3a).
Note that, in this model, the user perspective is restricted to her decision to buy a certain amount of resources for the currently valid price, while on the provider side we assume a monopolistic situation such that there is only one service type offered that cannot be substituted by another service type (with maybe a different price associated). In contrast, Fig. 3b depicts our structural model for QoE-based charging, where the additional QoE component has direct impact on the demand and is influenced by the context of the user (service, environment, mood, etc.), and especially the price to be charged. In the remainder of this section, we will analyse both these models step by step.

Charging for QoS: adaptive users
For the formal analysis of the model for QoS-based charging (see Fig. 3a), where users are not sensitive to QoE at all and just adapt their demand according to price and QoS, let $p$ indicate the price, $d$ the demand and $q$ the QoS. The resulting dynamic system is described by the following set of equations:
\[ p = p(q) \tag{1} \]
\[ d = d(p) \tag{2} \]
\[ q = q(d) \tag{3} \]
where, after appropriately rescaling the boundary values of the functions, without loss of generality we assume all functions to be continuous bijective mappings of the unit interval $[0, 1]$ onto itself.
With respect to the shape of these functions, during this entire section we make the following assumptions:
- (A1): $d(p)$ is monotonically decreasing and convex. The monotonicity of $d(p)$ is straightforward; the convexity results from the asymptotic behavior, as for the case of high prices the demand will tend towards zero.
- (A2): $q(d)$ is monotonically decreasing and concave. Due to the boundedness of underlying network resources, increasing demand will result in lower network QoS. However, in general any substantial QoS degradation requires a significant reduction of the available network resources while, on the other hand, typically services (especially real-time ones) have some minimum resource requirements (e.g., basic connectivity). Together, this motivates the concavity of the function.
- (A3): $p(q)$ is monotonically increasing and concave. Services should not become cheaper if their quality is improved, while they cannot be sold at all if they become overly expensive. Both conditions together result in the assumption.
Proposition 1 Assume (A1), (A2) and (A3) to be valid. If $d(p)$, $q(d)$ and $p(q)$ are linear, the entire unit interval constitutes a set of fixed points for the system of Eqs. (1)-(3).

Proof
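The normalization together with the monotonicity conditions yield $p(0) = 0$, $p(1) = 1$, $d(0) = 1$, $d(1) = 0$, $q(0) = 1$ and $q(1) = 0$. The existence of the two trivial fixed points can be checked in a straightforward manner. Assume now that all functions are linear. Then, for any fixed point price $p^* \in [0, 1]$, we have
\[ d(p^*) = 1 - p^*, \qquad q(d(p^*)) = 1 - d(p^*) = p^*, \qquad p(q(d(p^*))) = q(d(p^*)) = p^*. \]
More generally, for non-linear functions these equations generalize towards
\[ d(p^*) \le 1 - p^*, \qquad q(d(p^*)) \ge 1 - d(p^*) \ge p^*, \qquad p(q(d(p^*))) \ge q(d(p^*)) \ge p^*, \]
where the first inequality follows from the convexity in (A1) and the other two from the concavity in (A2) and (A3). Hence, if any of the inequalities is strict in the open interval $]0, 1[$, then $p^* = p(q(d(p^*))) > p^*$, which excludes the existence of additional interior fixed points. □
NB: Further numerical evidence suggests that $(0, 1, 0)$ is unstable and $(1, 0, 1)$ is stable.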
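The stability remark can be explored with a small numerical experiment. The following Python sketch iterates the loop from price to demand, from demand to QoS, and from QoS back to price. The three concrete functions are assumptions of this sketch, chosen only because they are strictly convex or concave and meet the normalization above; they are not taken from the model's empirical basis.

```python
import math

# Illustrative functions on [0, 1]; they satisfy (A1)-(A3) but are
# assumptions of this sketch, not functions taken from the paper.
def demand(p):
    """(A1): decreasing and convex, with d(0) = 1 and d(1) = 0."""
    return (1.0 - p) ** 2

def qos(d):
    """(A2): decreasing and concave, with q(0) = 1 and q(1) = 0."""
    return 1.0 - d ** 2

def price(q):
    """(A3): increasing and concave, with p(0) = 0 and p(1) = 1."""
    return math.sqrt(q)

def iterate(p0, steps=50):
    """Apply p -> d(p) -> q(d) -> p(q) repeatedly and return the trajectory."""
    trajectory = [p0]
    for _ in range(steps):
        trajectory.append(price(qos(demand(trajectory[-1]))))
    return trajectory

path = iterate(0.01)
print([round(p, 4) for p in path[:5]], "->", round(path[-1], 6))
```

With these particular functions, a start value slightly above zero moves away from the low-price fixed point and approaches the high-price one, while the two trivial start values 0 and 1 stay where they are. This is consistent with the numerical observation on stability, but of course does not prove it in general.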

QoE-based charging-submodel 1: price-sensitive users
In contrast to the simple model with predefined preferences considered so far, the situation is more complex when the charging mechanism integrates QoE as subjectively perceived by the end customer. Before addressing the full model for QoE-based charging (Fig. 3b), we first analyse two submodels, depending on whether the QoE perception of ("price sensitive") users is based on their expectations due to the pricing plan, or ("quality sensitive") users rather handle prices and QoE as independent parameters which separately influence their purchasing decision (and thus the demand), see Fig. 4.
Starting with the price sensitive submodel, note that, in our context, price plays a double role: on the one hand, like with the QoS-based model, the end customer pays for the level of offered service quality (hence prices are supposed to rise with increasing QoE); on the other hand, the price to be paid also forms part of the user context [13] and thus has direct impact on the quality perception itself: the higher a price, the higher also the user's expectations concerning the offered service quality (and, loosely speaking, the larger the probability to be disappointed). Hence, we may postulate that higher tariffs induce the QoE evaluation to deteriorate.
Figure 4a depicts the resulting price-sensitive submodel. Observe that there is a new function involved termed QoE function, which reflects the mentioned janiform role of prices and hence depends on both the QoS level $q$ and the price level $p$. Consequently, we may formulate a new system of equations where (2) and (3) remain identical, Eq. (1) becomes
\[ p = p(x) \tag{4} \]
and we have an additional equation for the QoE function
\[ x = x(q, p). \tag{5} \]
First of all, we extend our prior set of assumptions as follows:
- (A4): $p(x)$ is monotonically increasing and concave. This is in strict analogy to (A3), just replacing $q$ by $x$.
- (A5): $x(q, p)$ is monotonically increasing and concave in $q$, and monotonically decreasing in $p$. Better QoS should not deteriorate the associated QoE, while higher tariffs lead to higher quality expectations which lead to lower subjective quality perception. The concavity of the function can be argued in analogy to (A2), as it usually takes a severe degradation of QoS until the user subjectively perceives a noticeable degradation of QoE.
Indeed, for constant (e.g., flat) prices, empirical results indicate that for a broad range of scenarios, QoE depends logarithmically on the offered bandwidth (= QoS) [23], while, for constant QoS, QoE has already earlier been postulated to decrease with rising prices, see [27]. The resulting two-dimensional QoE function can of course be of a rather general form; to simplify a bit, we assume both effects to be independent from each other, and hence $x$ to be separable, cf. [27]:
\[ x(q, p) = x_1(q) \cdot x_2(p). \tag{6} \]
We interpret (6) as follows: the QoE depends on both the QoS offered by the provider, in terms of a monotonically increasing quality function $x_1(q)$, and the customer expectations triggered by the corresponding tariff (= price), which is expressed in terms of a (monotonically decreasing) expectation function $x_2(p)$. Without loss of generality we assume that $x_2(0) = 1$ and $x_2(1) = 0$, while $x_2$ is not subject to further restrictions (in fact, there is little empirical evidence so far about a reasonable shape of this function).
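The separable form in (6) can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the quality function is taken as a normalized logarithm with a hypothetical curvature parameter `a`, and the expectation function is taken as linear, which is only one of many shapes compatible with the normalization $x_2(0) = 1$, $x_2(1) = 0$.

```python
import math

def quality(q, a=9.0):
    """x1(q): normalized logarithmic quality function, x1(0) = 0 and x1(1) = 1.

    The curvature parameter `a` is a hypothetical choice for illustration.
    """
    return math.log1p(a * q) / math.log1p(a)

def expectation(p):
    """x2(p): linear expectation function with x2(0) = 1 and x2(1) = 0 (assumed shape)."""
    return 1.0 - p

def qoe(q, p):
    """Separable QoE function x(q, p) = x1(q) * x2(p), cf. Eq. (6)."""
    return quality(q) * expectation(p)

# Rows: increasing price; columns: increasing QoS.
for p in (0.0, 0.25, 0.5):
    print(p, [round(qoe(q, p), 3) for q in (0.25, 0.5, 1.0)])
```

Each row of the printed table rises with QoS, and each column falls with price, which is the double role of the price described above.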
For non-linear functions, similarly to the proof of Proposition 1, we derive that $\ge 1$, and is a product of two continuously differentiable functions and hence continuously differentiable, with $x|_{p=0} = x|_{p=1} = 0$. Hence, with the product rule we have Eventually, in case any of the concave/convex properties is strict, the inequality in the last equation is strict, hence according to Bolzano's theorem there exists at least one $p^{**} \in \,]0, 1[$ with $x(p^{**}) - p^{**} = 0$ which is an interior (non-trivial) fixed point. Finally, if we additionally assume that $x_1$ and $x_2$ are concave, then $x$ as in (6) is concave itself, and the non-trivial fixed point is unique.
NB: Numerical evidence suggests that in the general case $p^*$ is unstable while $p^{**}$ is stable.

QoE-based charging-submodel 2: quality-sensitive users
For the case of a quality sensitive user, the corresponding submodel is depicted in Fig. 4b, where now the demand function has two input parameters: price and QoE. In analogy to Submodel 1, we assume for $d(x, p)$ also here separability into a quality-driven demand function $d_1(x)$ and a price-driven demand function $d_2(p)$:
\[ d(x, p) = d_1(x) \cdot d_2(p), \tag{7} \]
while, in this submodel, QoE depends on QoS only:
\[ x = x(q). \tag{8} \]
While (A5) still applies for this $x(q)$ (just assuming independence from $p$ in Eq. (6)), the underlying assumption for the new demand function in (7) is motivated as follows:
- (A6): $d(x, p)$ is monotonically increasing and concave in $x$ and monotonically decreasing in $p$. The suggested monotonicity originates from the typical antagonistic demand behaviour: better quality is assumed to increase demand, higher prices decrease demand, see (A1). Concavity in $x$ results from both competitivity and boundedness of the market: while already a small advantage in terms of perceived quality leads to a significant increase of the market share, this effect becomes weaker for larger established market shares.
Proof In addition to Proposition 1, we assume If all functions are linear, we have Hence $(p^*, d^*, q^*) = (1, 0, 1)$ is the unique fixed point.

Numerical examples for submodel 1 and submodel 2
For the purpose of illustrating the process of convergence in more detail, consider two simple examples, i.e., a purely price-sensitive user and a purely quality-sensitive user, whose respective quality functions $x_1(q)$, see (6), and $x(q)$, see (8), both are assumed to have a logarithmic shape [25] while all other functions are supposed to be linear. Fig. 5 depicts the resulting step-wise convergence behavior where $p^{(i)}$ refers to the $i$-th iteration of the price towards equilibrium and $p^{(0)}$ is assumed to be slightly larger than 0 to avoid trivial fixed points. Note that this choice corresponds to the fact that in the user trials described in section "End user convergence behaviour", each experiment has started with the quality level corresponding to the lowest price.
Hence, for the case of the price-sensitive user (Submodel 1), Eqs. (4), (6), (3) and (2), applied in this order, lead to Similarly, for the case of the quality-sensitive user (Submodel 2), Eqs. (4), (8), (3), (7) and again (4), applied in this order, result in Note that all functions have been normalized to the unit interval as described previously, hence all linear functions are either the identity function id or 1-id, depending on their slope (positive or negative), while in both cases the respective quality functions are normalized logarithmic functions.
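For the price-sensitive case, the iteration can be sketched in Python as follows. The sketch assumes that the linear functions are exactly the identity or its complement, as stated above, and that the normalized logarithmic quality function uses a hypothetical curvature parameter `a`; the resulting numbers therefore illustrate the mechanism only and do not reproduce Fig. 5.

```python
import math

def quality(q, a=9.0):
    """Normalized logarithmic quality function x1(q); `a` is a hypothetical parameter."""
    return math.log1p(a * q) / math.log1p(a)

def next_price(p):
    """One round of Submodel 1 with linear d, q, p and x2 (assumed id or 1-id).

    d(p) = 1 - p, q(d) = 1 - d, x = x1(q) * (1 - p), p(x) = x.
    """
    d = 1.0 - p
    q = 1.0 - d
    x = quality(q) * (1.0 - p)
    return x

def converge(p0, tol=1e-12, max_iter=1000):
    """Iterate from p0 until successive prices differ by less than tol."""
    p = p0
    for i in range(max_iter):
        p_new = next_price(p)
        if abs(p_new - p) < tol:
            return p_new, i + 1
        p = p_new
    return p, max_iter

p_star, n_iter = converge(0.01)
print(f"interior fixed point near p = {p_star:.4f} after {n_iter} iterations")
```

Starting slightly above zero, the price first rises and then settles at an interior value, whereas a start at exactly zero never leaves the trivial fixed point. The quality-sensitive case is not sketched here, because the order in which demand and QoE are updated would have to be assumed.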
We consider these examples to reflect two very fundamental patterns that are widely observed also in practice, as will be demonstrated in the next section. Hence, we will come back to Fig. 5 in the course of section "Convergence-based user classification" as primary inspiration for our user classification approach presented there.

Full model: price-and quality-sensitive users
Turning now to the full model as depicted in Fig. 3b, firstly let us subsume all user-specific impact factors, e.g., due to personality or current contextual state [14], under the overarching notion of a Context Function $\Omega$. In fact, such context factors are external to the model.
In order to create a unified full model which integrates both submodels as special cases, we may take strong advantage of the product form of $x(q, p) = x_1(q) \cdot x_2(p)$ in (6) and $d(x, p) = d_1(x) \cdot d_2(p)$ in (7), resp., and have to achieve that $d_1(x) \equiv 1$ for the case of Submodel 1 (price-sensitive users, cf. (2)), and $x_2(p) \equiv 1$ for the case of Submodel 2 (quality-sensitive users, cf. (8)). The easiest way to fulfill these requirements is by defining a system of equations that depends on a sensitivity parameter taking values in $[0, 1]$. Observe that, if the sensitivity parameter equals 1, these equations coincide with (2) and (6), resp. (i.e., price-sensitive case), whereas if it equals 0, they coincide with (7) and (8), resp. (i.e., quality-sensitive case).
For any other value of the sensitivity parameter, if all functions are assumed to be linear, a corresponding fixed point equation results.
NB: Also in this case, numerical evidence strongly suggests these fixed points to be stable.
Fig. 5 Step-wise convergence (top) and first 10 iterations (bottom) for strictly concave quality functions: quality-sensitive users (left column) and price-sensitive users (right column)
Fig. 6 ... (14) (y-axis) and shows where this expression equals zero (curved line). Note that, for reasons of better illustration, both the sensitivity-parameter axis and the p-axis are depicted in reverse directions

Conclusions for sensitivity-based user classification
Summarizing the above fixed point (FP) analysis of our models for different user types w.r.t. charging for QoE, Proposition 1 tells that adaptive users, who do not care for QoE at all, go either for free low quality (unstable FP) or expensive high quality (stable FP). Proposition 2 shows that strictly price-sensitive users have an unstable FP at low quality plus a stable one which is non-trivial. In contrast, Proposition 3 states that strictly quality-sensitive users strongly prefer expensive high quality. Finally, according to the Full Model, mixed users (with a sensitivity parameter strictly between 0 and 1) end up at a stable non-trivial FP which depends on the sensitivity parameter as depicted in Fig. 6.
A closer look at Fig. 6 reveals that the latter dependence on the sensitivity parameter is not linear: the FP price stays within a relatively small interval if the parameter varies around 0.5, whereas the curve is rather steep for parameter values close to 0 or 1. Hence, if we assume the sensitivity parameter for cross-sensitive users to be uniformly distributed on $]0, 1[$, we may conclude that the FP prices for the majority of cross-sensitive users will concentrate in the middle of the unit interval, whereas both ends (low and high prices, resp.) are only lightly populated.
Altogether, if in general the overall population typically consists of some adaptive, some price- or quality-sensitive and a majority of mixed (cross-sensitive) users, we may expect the resulting distribution of FP prices to exhibit clear peaks for the lowest and the highest prices (unstable part of adaptive or price-sensitive users + quality-sensitive users, resp.), with a "bell curve" in between (cross-sensitive users + stable part of price-sensitive users). For the moment, this hypothesis is a simple prediction resulting from our rather simple models; however, in the next section we will see that it is surprisingly consistent with real-world user behavior.

End user convergence behaviour
After this extensive fixed point analysis of analytical models for QoE-based charging, we now turn towards end user convergence behavior as exhibited in practice. To this end, we refer to the results of three comprehensive user trials, subsequently referred to as the 2011, 2012 and 2015 trials. The 2011 trial was intended to assess the general readiness to pay for improved network qualities and associated services by studying the purchasing behaviour of customers for packet loss-impaired UDP streams. The subsequent and more elaborate study design of 2012 and onwards, however, aimed at providing deeper insights into two aspects: (1) the separation of pricing and quality motives for the studied purchasing decisions, and (2) the identification of the absolute maximum WTP for network quality using a more realistic technological setup (i.e., video streams with various bitrates and prevailing video codecs using TCP transport). These results were compared to a retest in Vienna (Austria) and Oulu (Finland) in 2015 [18] (i.e., the 2015 trial), where the video codec was updated from H.264 (2012) to the newer H.265 (2015), and the setup was reparametrized accordingly.
The subsequent analysis will primarily concentrate on our 2012 trial, but will selectively use the 2015 trials for elaborating on details or validating findings. It further builds on the outcome of the 2011 trial that a general readiness to pay exists for enhanced (video) network quality. In the following subsections, we briefly describe the trial setups, then study the convergence behaviour of end users, and present an approach for their algorithmic classification.

User trial setup and general results
The technical setup of our 2012 user trials is a modified version of the one used in 2011 and described in [28,29], and was reused with a modernised toolset in the 2015 validation testing. The basic setup is depicted in Fig. 7. While in [28,29] we have distinguished between four classes of Standard Definition (SD) video quality based on different packet loss levels for UDP-based transmission, the revised setup aims at being significantly closer to reality. We use the TCP-based adaptive video streaming technology HTTP Live Streaming (HLS) in order to adapt video quality to network conditions, i.e., mainly bandwidth. Using high definition (HD) Blu-ray quality allows differentiating a large number of quality levels (17 in our case), based on logarithmically scaled bitrates (H.264 encoding). For cross-validation purposes, we have included three additional "virtual" quality classes which identically employ the best possible bitrate but still differ in terms of prices. Hence, the trial subjects have been exposed to a total of 20 offered quality classes, see Table 1. Note that, as the virtual classes Q17-Q19 identically offer the highest available video bitrate at still growing prices, selecting a higher quality class than Q16 seems irrational; this is an essential addition to the test design in order to reveal the dominating factors for the purchasing behavior, i.e., price-/quality-sensitivity.
In the retesting in 2015 [18], due to the newer codec (H.265 instead of H.264), lower maximal bitrates were tested, i.e., 16384 kbit/s. Instead of 20 quality classes, only 8 were used in order to increase the sample sizes within each quality class, see Table 2. The proprietary HLS solution was further replaced by the similarly functioning Dynamic Adaptive Streaming over HTTP (DASH) standard [6]. Otherwise the technical setup closely follows the 2012 trial.
For the logarithmically increasing bitrates from Q0 to Q16 (see Table 1) and Q0 to Q7 (see Table 2), respectively, the trial design in 2012 and 2015 used three tariffs: low-tier tariff A, medium-tier tariff B and high-tier tariff C with maximum prices p_max of EUR 2.00, EUR 3.00 and EUR 4.00, resp. Between the identical minimum price of zero for Q0 and the respective p_max for Q19, an increasing price curve was used which is linear with respect to the classes. Hence, for the highest bitrate 4 different prices were listed in order to test other forms of price discrimination. While using identical maximum prices of EUR 2.00, EUR 3.00 and EUR 4.00, the retests in 2015 only used 8 price steps with corresponding quality steps (i.e., the price curve was not additionally stretched for additional quality classes).
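The tariff design described above can be illustrated by a minimal sketch. It assumes that "linear with respect to the classes" means a price of p_max times the class index divided by the index of the top class, and that prices are rounded to cents; the function name `class_price` and the rounding are illustrative assumptions, not taken from the trial software.

```python
# Maximum prices per tariff in EUR, as stated in the text.
TARIFF_MAX_EUR = {"A": 2.00, "B": 3.00, "C": 4.00}


def class_price(quality_class: int, tariff: str, top_class: int = 19) -> float:
    """Price in EUR of a quality class under a linear price curve.

    Assumption: Q0 is free, the top class costs p_max, and the curve is
    linear in the class index. Rounding to cents is also an assumption.
    """
    if not 0 <= quality_class <= top_class:
        raise ValueError("quality class out of range")
    p_max = TARIFF_MAX_EUR[tariff]
    return round(p_max * quality_class / top_class, 2)


# The four classes Q16-Q19 share the highest bitrate but carry four prices.
for tariff in "ABC":
    print(tariff, [class_price(q, tariff) for q in range(16, 20)])

# 2015 retest: only 8 classes (Q0-Q7), with Q7 priced at p_max.
print([class_price(q, "B", top_class=7) for q in range(8)])
```

Under these assumptions, the classes Q16 to Q19 receive four distinct prices in every tariff, which is the property the virtual classes rely on.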
As in [5] and [28], trial users have been given real money (10 Euro each in our case), which they could freely spend on quality enhancements during the trial or take home afterwards. Together with the mentioned fine granularity of quality classes, this setup allows for observing detailed user interaction behavior. During the trial, each test subject watches three video sequences (each 20 min long) individually chosen from a representative video library (including, e.g., highly topical blockbusters). Starting per default with the poorest quality level Q0 (which remains free of charge during the entire trial), the subjects can use a jog wheel for dynamically and interactively testing the effect of quality adjustments during an initial period of around 5 min free of charge. Users are always informed about the price of each selection, while the range of available quality levels and the tariff designs are intentionally hidden for reducing unnecessary biases. Due to a highly improved setup (our own VLC client fork, plus precise logging mechanisms), quality changes are quickly applied (with a delay of about 1 s only) and tracked with a granularity of 1 s. After the free trial phase, the latest selected quality level is taken as final choice, and the corresponding price is deducted from the user's cash deposit. Users then watch the remaining movie clip in the chosen quality without any further interaction. After the experiments, the remaining deposit is paid out in cash to the subjects, as announced before the trial.
Using a jog wheel as physical user interface for changing between quality classes provides significant benefits, as it employs an intuitive mechanism well-known to all users (for instance from sound volume control). Moreover, it creates the illusion of an infinite number of quality levels (of course, in reality requests for lowering the quality below Q0 or raising it above Q19, or above Q7 in the 2015 case, have simply been ignored). In this way, it was possible to record (with temporal granularity of 1 s) user behavior both in terms of selecting quality classes as well as in terms of their convergence behavior, as indicated for instance by the number of "trend" changes between increasing and decreasing the quality (driving the jog wheel up and down, resp.).
Overall, 43 test users (12 male, 31 female) participated in our 2012 trial, 40 of whom completed it successfully, with three movies for each user. Our number of participants is well in line with typical sizes for user trials (e.g., VQEG methodology: 24 valid correlating subjects, ITU-T Rec. P.910: 4 to 40 test users). Most of them (26 persons) were between 18 and 30 years old, 10 persons between 31 and 45 years, and 7 older than 45 years. 38 participants had a higher school or university degree, and most of them were employed (16) or students (16). 8 persons were married or living in a relationship. Only 3 users had previous experience with charged video on demand (VoD) services, with unsystematic expenditures varying from 3 EUR to 8 EUR per movie. The validation trial in 2015 was divided into two regional cases using identical tooling and almost identical video contents: the Vienna and the Oulu campaign. In Vienna, 22 test users participated, of whom 41% were female and 86% had graduated from a university. Most of the subjects were between 20 and 29 years old (11 persons), 2 persons were between 10 and 19 years, 6 persons between 30 and 39, 1 subject was between 40 and 49, and 2 subjects were older. Their experience with VoD services was still limited (32% with subscription; 55% had purchased video contents online before). In Oulu, 19 additional test users participated, of whom 21% were female and 90% had graduated from a university. Most test users were between 30 and 39 years old, 2 persons were younger, 7 persons were older. 68% had seldom purchased video contents online before, 58% had a video service subscription, and 32% did not use any VoD service at all.
Users were randomly assigned to tariff schemes A, B, C as introduced above; however, due to the limited sample size, only a representative subset of the potential tariff permutations has been tested, see Table 3 for details. Note that Group 1 addresses the case of monotonically increasing prices, while Group 2 experiences decreasing prices. For the control group, prices are kept constant on medium level B for the first two movies, while the third movie is randomly assigned to either price plan A or C. In this way, sample sizes for the three tariffs are kept in balance, while the tariff distribution allows further analyses, cf. [35]. To increase the sample sizes within each group, Group 2 was eliminated in the validation testing in 2015. Moreover, the reduced number of quality classes also increases the sample sizes per quality class.
Before we address an approach for user classification based on their convergence behaviour in more detail, let us have a look at some interesting general results. To start with, Fig. 8a illustrates the distribution of the quality classes eventually selected by the users at the end of the free trial phase. Most notably, the nature of this empirical distribution confirms precisely the hypothesis following from our fixed point analysis: as predicted in section "Conclusions for sensitivity-based user classification", we observe clear local maxima at both ends together with a (skewed) bell shape in between.
Complementing this result, Fig. 8b depicts the distribution of quality/price level changes of the 2012 trial (typically between 10 and 50 per movie) during the free 5 min trial phase, while Fig. 9a shows the distribution of the corresponding number of "trend changes", i.e., ups and downs until the user decision converges towards a final value. Here, the majority of test subjects restrict themselves to a maximum of 5 trend changes, while a few undecided test subjects exhibited a total of 25 or more trend changes.
Another interesting issue concerns the points in time when quality changes happen. Figure 9b for the 2012 trial (the 2015 results are structurally similar) indicates that indeed most quality changes have been undertaken towards the end of the free trial period. This seems counterintuitive, as test users always start with Q0 (which, by the way, introduces a certain bias we have to be aware of) and have to increase the quality until the desired level is reached. However, the last few seconds of the trial period determine the quality for the rest of the movie, hence the aggregation of those anticipated future requirements may serve as an explanation of this particular behaviour.
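The two counts used above (level changes and trend changes) can be derived from a per-second log of selected quality classes roughly as follows. This is a sketch under the assumption that a trend change is a reversal between an upward and a downward move, with seconds in which the level is held being ignored; the function names and the sample trace are illustrative and not taken from the trial tooling.

```python
def count_level_changes(trace):
    """Number of seconds in which the selected quality class changed."""
    return sum(1 for prev, cur in zip(trace, trace[1:]) if cur != prev)


def count_trend_changes(trace):
    """Number of reversals between increasing and decreasing the quality.

    Assumption: holding a level does not count as a move, so an
    up-hold-down sequence is counted as a single trend change.
    """
    changes = 0
    last_direction = 0  # +1 while rising, -1 while falling, 0 before the first move
    for prev, cur in zip(trace, trace[1:]):
        if cur == prev:
            continue
        direction = 1 if cur > prev else -1
        if last_direction != 0 and direction != last_direction:
            changes += 1
        last_direction = direction
    return changes


# Invented example trace: climb to Q3, step back to Q2, then settle at Q5.
trace = [0, 1, 2, 3, 3, 2, 2, 4, 5, 5]
print(count_level_changes(trace), count_trend_changes(trace))
```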

Convergence-based user classification
Having confirmed our hypothesis about a sensitivity-based user classification, we now present and discuss a quantitative approach for user classification, based on the assumption that users behave consistently during the entire trial, i.e., follow the same quality/price selection strategy for all movies. While this cannot be claimed to be universally valid (for instance, due to learning effects), for our purposes we may consider it a useful working hypothesis.
First of all, remember that the results of our model-based analysis in section "Fixed-point models for QoS- and QoE-based charging" as sketched in Fig. 5 suggest two fundamental user behavior patterns: either a damped harmonic oscillation (quality-sensitive users) or a steady increase towards the ideal price/quality level (price-sensitive users). While these two patterns strongly remind us of the two basic options (underdamped vs. overdamped case) for damped harmonic oscillators depending on the parametrization of the underlying second-order differential equations, we may simply interpret the latter pattern as a somewhat "degenerate" version of the former one, where the equilibrium price/quality is approached only from below and the initially monotonic behavior never turns into an oscillatory one. Henceforth, in general we may characterize user behavior by two parameters: (1) a certain amplitude of quality/price selection, and (2) the speed of convergence towards the final choice. Within the resulting two-dimensional space, we end up distinguishing three different user classes labeled as "F" (fast convergence), "R" (regular convergence) and "S" (slow convergence), which are illustrated in Fig. 10. Note that we have to supplement these three "regular classes" by another class "X" consisting of users with irregular behavior who, for whatever reason, cannot be placed in one of the regular classes. For further illustration, we depict a couple of typical examples for the resulting classes which are taken from both Groups 1 and 2, where the different colors refer to the tariffs introduced earlier (green: A, blue: B, red: C). The assessment conducted in the following is centred around the 2012 trial, which provides the most comprehensive data due to a higher sample size and the most fine-grained tracking of user interaction behavior.

Type "F": fast convergence-small amplitude
The first general type of users is characterized by a relatively consistent convergence behavior: users climb up the quality ladder until reaching the targeted quality, and stay there without major changes (see Fig. 11).

Type "S": slow convergence-large amplitude
Contrasting the examples depicted in Fig. 11, a second user type may be characterized exactly by the opposite behavior, i.e., large amplitude and slow speed of attenuation, see Fig. 12 (observe that the change frequency may be quite different, while the overall convergence speed seems roughly comparable).

Type "R": regular convergence-medium amplitude
The third user type can be described as somewhat in between types F and S. It is characterized by large to medium amplitudes as well as medium speed of convergence, see Fig. 13 for typical examples. Observe how, in the first example, the behavior during the first movie is different from the others and more exploratory, suggesting a kind of learning behavior.

Type "X": free riding and irregular user behavior
Finally, also some sporadic cases of free riding (Fig. 14 left) and further irregular user behavior (see, e.g., Fig. 14 right) have been observed (class "X"). Note, however, that these cases account for only around 15% of the overall number of samples.

Aggregated convergence behavior
In order to capture the aggregated convergence behavior for user i, as described in [27] we define the root square deviation (RSD) σ_i(t) as a function of time t with respect to the convergence value of the user's selection x_i(t), i.e., lim_{t→∞} x_i(t) = x_i(300), see Eq. (15). Moreover, Eq. (16) defines the reference root mean square deviations (RRMSD) of classes F, R and S, respectively. Then, for user i and class k, the metric defined in (17) represents the average square difference between user i's RSD and the RRMSD of class k over the selection period (with duration 300 s). Note as a general remark that class F is more coherent (i.e., closer to its RRMSD) than class R, which itself is more coherent than class S, in terms of the expected values of this metric per class. Hence, for our purposes we heuristically assume that the metric of user i is at most 1 for class F, larger than 1 and at most 2 for class R, and larger than 2 and at most 3 for class S. If user i does not belong to any of these three "regular" classes, she is put into class X. In this way, irregular behavior (which accounts for less than 15% of the cases) may easily be identified by an excessive size of the metric defined in (17).
In this way we were able to classify 39 out of the 40 relevant participants of our trial correctly. The algorithm failed only in the case of one user who was converging to her final choice (maximal quality) within 32 s without later changes, resulting in a significant average deviation from the class-F RRMSD σ_F(t). The distribution of the other 39 participants together with the corresponding mean and standard deviation values of the metric (17) is depicted in Table 4 (note that for class X, we have averaged mean and standard deviation over the differences to all three classes).
Figure 15 depicts the three RRMSDs described in (16) (grey lines) together with the average value of σ_i(t) according to Eq. (15) per class. We conclude that the convergence metric defined above together with the classification algorithm allows a sufficient distinction between the "regular" classes F, R, S. In total, around 85% of our user population exhibit convergence behavior towards a fixed point as suggested by the mathematical analysis presented in "Fixed-point models for QoS- and QoE-based charging", while almost every second user approaches her final choice on a track which is best described as negative exponential (with different speeds of convergence between classes F and S).
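The classification procedure of this subsection can be sketched as follows. Several parts are assumptions made purely for illustration: the RSD is read as the square root of the squared deviation from the final choice, the class reference curves are invented negative exponentials (the RRMSDs of the trial are not reproduced here), and the thresholds are checked in the order F, R, S. All names (`rsd`, `mean_square_gap`, `classify`, `REFERENCES`) are hypothetical.

```python
import math

DURATION_S = 300  # length of the free selection period in seconds (from the text)


def rsd(trace):
    """Per-second deviation from the final choice (assumed reading of the RSD)."""
    final = trace[-1]  # convergence value: the selection at the end of the period
    return [math.sqrt((x - final) ** 2) for x in trace]


def mean_square_gap(user_rsd, reference):
    """Average squared difference between a user's RSD and a class reference curve."""
    return sum((u - r) ** 2 for u, r in zip(user_rsd, reference)) / len(user_rsd)


def classify(trace, references):
    """Apply the heuristic thresholds; testing F, then R, then S is an assumption."""
    deviation = rsd(trace)
    gap = {k: mean_square_gap(deviation, ref) for k, ref in references.items()}
    if gap["F"] <= 1:
        return "F"
    if 1 < gap["R"] <= 2:
        return "R"
    if 2 < gap["S"] <= 3:
        return "S"
    return "X"


# Hypothetical reference curves (NOT the RRMSDs of the trial): negative
# exponentials with an invented amplitude and invented time constants.
REFERENCES = {
    k: [8.0 * math.exp(-t / tau) for t in range(DURATION_S)]
    for k, tau in {"F": 20.0, "R": 60.0, "S": 150.0}.items()
}

fast_user = [10.0 - 8.0 * math.exp(-t / 20.0) for t in range(DURATION_S)]
free_rider = [0.0] * DURATION_S  # never leaves Q0
print(classify(fast_user, REFERENCES), classify(free_rider, REFERENCES))
```

With these invented curves, a user who follows the fast reference is assigned to class F, whereas a free rider who never leaves Q0 exceeds all three threshold bands and ends up in class X.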

Willingness to pay and user interaction behavior
Having discussed user classification in some detail, we will now concentrate on monetary aspects of our trial results. First of all, observe that users' willingness to pay is reflected in an overall median spending over all test subjects and tariffs ranging between EUR 1.29 and EUR 1.71 for the respective trials (see details in Table 5). While there were users spending the overall maximum (EUR 4.00) as well as others spending nothing, for further analysis we focus on two different aggregations: Aggregation per round: All movies shown at the i-th round (i = 1, 2, 3) are aggregated to measurement M_i, see Table 6. Note that, while overall expenditures do not exhibit significant differences, slightly lower spendings and SD values are observed for M_2. Moreover, none of the subjects with tariff C have purchased Q19 during M_1, while this has changed in subsequent rounds. Hence, we conclude that average purchasing behavior over time is rather stable, while individual behavior over time (from M_1 to M_3) may change.
Aggregation per tariff: Studying each tariff independently, we observe that the median expenditure monotonically increases from EUR 0.74 to EUR 1.26 on the way from tariff A to C, indicating that absolute price increases raise absolute revenues. On the other hand, if expenditures are normalized to the unit interval (i.e., for each tariff the maximum possible expenditure is scaled to 1), the resulting relative expenditures exhibit a (light) downwards slope.
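The normalization can be retraced with the two median values quoted above. The sketch assumes that EUR 0.74 belongs to tariff A and EUR 1.26 to tariff C, and it uses the fact that dividing by a constant p_max per tariff commutes with taking the median; the variable names are illustrative.

```python
# Maximum prices and median expenditures in EUR for tariffs A and C, as quoted
# in the text (the median for tariff B is not stated there).
P_MAX_EUR = {"A": 2.00, "C": 4.00}
MEDIAN_EUR = {"A": 0.74, "C": 1.26}


def normalized(expenditure_eur: float, p_max_eur: float) -> float:
    """Scale an expenditure so that the maximum possible expenditure equals 1."""
    return expenditure_eur / p_max_eur


shares = {t: normalized(MEDIAN_EUR[t], P_MAX_EUR[t]) for t in P_MAX_EUR}
for tariff, share in shares.items():
    print(f"tariff {tariff}: {share:.3f} of p_max")
```

The absolute median rises from A to C, while the normalized median falls from 0.37 to about 0.32, which matches the light downward slope described above.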
Not unexpectedly, higher prices also increase the SD significantly: in the normalized data, fluctuations around 26% are observed, while a higher absolute SD suggests that higher prices trigger a broader reaction concerning possible pricing strategies. All in all, taking into account the non-linearity of absolute expenditure increase and the complex purchasing behaviors on a normalized scale, we conclude that the end user's willingness to pay does not increase linearly with respect to the tariff level, but exhibits a more complex dependency [35]. With the normalized data set, it turns out that around half of the test subjects decided to purchase quality levels priced between 20 and 40% of the respective maximum, while a significant number of users went for medium levels between 40 and 60% of the maximum. Local peaks are to be found also at 0 and 100%; in addition, we observe a non-negligible number of purchases for the top-bitrate quality levels (Q16-Q19, including the virtual classes), suggesting the existence of a noticeable premium segment where users seek excellent quality irrespective of price. The broad majority, however, appears to be rather price-sensitive.
Complementing the convergence results presented in "End user convergence behaviour", we may also consider aggregated user behavior over time depending on the eventually chosen quality/price level. To this end, we again distinguish quality-sensitive users (paying eventually more than an average user) from price-sensitive users (paying less than the average). Figures 16, 17 and 18 depict the aggregated convergence behavior of these two groups for the 2012 trial and the 2015 trials in Vienna and Oulu, respectively. Observe that, due to the complete reimplementation of the 2015 trial, there are some data differences to be acknowledged: the 2012 trial recorded the selected quality every second (i.e., inputs over time in seconds), while in the revised setup all inputs have been recorded (i.e., sequence of inputs).
In most cases, we observe a relatively steep ascent during the first minute, where quality is the driver while users have to overcome the unacceptable initial quality condition Q0. Then, the price becomes the limiting factor and leads to an (aggregated) equilibrium state (while, of course, individual users still continue changing the quality levels, see Fig. 8b). The effect is especially pronounced for the price-sensitive users, who quickly change to cheap and low-quality offers. Thus, we may conclude that end users are going for an optimal final quality level based on their maximum willingness-to-pay, which can be interpreted as an active decision for an acceptable balance between price and quality, potentially involving cognitive dissonance [29]. For further details on this aggregated analysis for the case of the 2012 trial we refer to [35].
As another interesting side remark, observe that Table 7 shows how absolute expenditures have been significantly increasing along with maximum prices in all trials (i.e., from A to C). On the other hand, normalized expenditures slightly decrease in most cases (especially during the transition A → B) before stabilizing (B → C). This may have two reasons: either end users are not willing to get accustomed to lower quality levels, or effects of market entrance pricing are observed [35]. For instance, a low price in the first round may trigger lower subsequent expenditures (as users are avoiding luxury purchases), whereas high initial prices do not impact the users' purchasing behavior. However, if certain minimum quality expectations are about to be undercut, users avoid opting out (which the trial design does not allow anyhow). This illustrates different user mindsets with respect to price vs. quality sensitivity, see [35] for further details. On an aggregated level, this confirms that subjects are both price and quality sensitive in video quality markets with explicitly shown price tags, which classifies a resulting representative aggregate user into the "QoE-based charging-submodel 2" (see section "Fixed-point models for QoS- and QoE-based charging"). However, the high standard deviation (e.g., see Table 6) and the previous findings in the 2011 user trial [28,29] indicate that multiple customer segments exist, where the latter work finds quality-optimising, price-optimising and mixed quality-price-optimising subjects, as well as strategically acting subjects seeking bargains on the market.

Summary and conclusions
In this paper, we discuss several fundamental charging-related aspects of the current paradigm change from Quality of Service (QoS) to Quality of Experience (QoE), significantly extending earlier work on this topic. Based on a comprehensive analysis of context gaps, we have proposed and analyzed a fixed point model for QoE-based charging, whose specific complexity comes from the fact that it considers both price sensitivity and quality sensitivity of users. Here, the role of prices becomes twofold: they serve as expression for the received service quality, and at the same time they may significantly impact QoE evaluation. The applicability of this model is illustrated with results from a series of user trials of QoE evaluation and charging for video streaming. Our advanced trial setup allows users to choose in real time between a broad range of fine-granular quality classes of an HD video subject to different price plans, and also allows validating our approach for user classification based on their convergence behaviour. Together with our quantitative investigation of users' willingness to pay (WTP), these empirical results confirm the necessity to consider price- and quality-sensitive user groups, and additionally suggest separating charging for QoE from QoE perception itself. Hence, the fixed-point analysis illustrates the difficulty of directly characterising the QoE market from QoE ratings that have been obtained outside of purchasing situations, while the empirical WTP data further support this perspective by sketching a relationship for quality- and price-based network service differentiation that has no direct relationship to classical, for example logarithmic, QoE curves. This illustrates the remarkable influence of pricing on consumer decisions and their appreciation for network quality, which, however, also affects the communication to and with end customers.
This paper has aimed at laying grounds for various directions of further work. As far as the model is concerned, we are currently performing a detailed stability analysis of the resulting fixed points. Another important open issue concerns the parametrization of the various functions involved, which requires further highly focused user trials. Additional trials are also necessary to clarify the separability assumptions expressed in Eqs. (6) and (7), as well as our assumption on the consistency of user convergence behavior over time. Furthermore, the connection between charging and perceived quality is obviously also relevant for a broad class of network-based services beyond video streaming, like, for instance, web access, VoIP, future cloud services etc. Finally, it would be very interesting to transfer our settings towards large-scale empirical studies, for instance in the framework of a dedicated field trial involving real customers and services. This would allow overcoming the methodological limitations of laboratory-based experiments and gaining additional valuable insight into the extent to which users are indeed willing to pay for what they get in terms of quality experience.
Last, but not least, another open issue concerns the establishment of efficient frameworks for selling quality-differentiated network services to consumers. Historically, for Internet resources often the well-known concept of service level agreements (SLA) has been employed as a reference, which is however problematic for QoE products as discussed here [32], because around QoE (as the notion already suggests) only experience products [20] can be formed, i.e., products whose value can be appreciated by consumers only a posteriori. For this reason, the definition of new concepts may be required, for instance in the form of Experience Level Agreements (ELA) as initially discussed in [32], where the beneficial characteristics of SLAs meet the experience aspects required for the context of QoE products. So far, this idea has remained on a conceptual basis, and given the current lack of interest on the operator side in providing end-user SLAs, its relevance still seems somewhat limited at the moment. Further work in this direction might nevertheless become another important building block for a successful commercialization of QoE.

Fig. 2 Mismatch between QoE contexts: a small HD parameter range is contrasted with a global alternative and with an SD-based video QoE test [33]. MOS values (y-axis) below 1 and above 5 are considered out of range

Fig. 3 Charging for network quality

As in the proof of Proposition 2, the general case follows with d_1(p*) ≤ 1 − p* ≤ d* and d_2(x) ≤ 1: if any of the inequalities is strict on the open interval ]0, 1[, this leads to a contradiction, hence no additional nontrivial (interior) fixed point exists in this case. □

Figure 6 depicts the function f of price p and sensitivity parameter on [0, 1] × [0, 1] for two values of the remaining model parameter (1 and 0.85), illustrating existence and uniqueness of the fixed point in two typical cases. Observe that, independently of this remaining parameter, a sensitivity parameter of 1 results in the price p = 0 derived in Proposition 2 for price-sensitive users, whereas the high resulting price at the opposite end of the sensitivity range is characteristic for quality-sensitive users according to Proposition 3. NB: Also in this case, numerical evidence strongly suggests these fixed points to be stable.

Fig. 6 Fixed point characterisation for the full model according to (13), shown for two parameter values (left: 1, right: 0.85). The saddle-like surface depicts the left side of Eq. (13) depending on price p (x-axis) and the sensitivity parameter (y-axis)

Fig. 7 Technical setup of the user study

Fig. 8 Histograms on the user interaction with our quality market (2012 trial)

Fig. 9 Temporal histograms on user interaction with our quality market (2012 trial)

Fig. 10 Dimensions of user classification

Table 1
Quality classes (in kbit/s)

Table 3
Tariff assignment

Table 5
Absolute expenditures (in EUR and as % of p max )

Table 6
Expenditures per round (in EUR)

Table 7
Expenditures per tariff (in EUR); A: p_max = 2.00, B: p_max = 3.00, C: p_max = 4.00