Inattentive consumers in markets for services

In an experiment on markets for services, we find that consumers are likely to stick to default tariffs and achieve suboptimal outcomes. We find that inattention to the task of choosing a better tariff is likely to be a substantial problem in addition to any task and tariff complexity effect. The institutional setup on which we primarily model our experiment is the UK electricity and gas markets, and our conclusion is that the new measures by the UK regulator Ofgem to improve consumer outcomes are likely to be of limited impact.


Introduction
This paper presents an experiment that tries to identify whether, in markets for services, consumers are likely to stick to defaults and achieve suboptimal outcomes, and whether inattention is likely to play a role in explaining why this happens, together with the more traditional explanation of the complexity of the decision problem. In order for consumers to reap benefits from competition, they have to be actively engaged in spotting the best deal that is available to them. This is true both in the tautological sense that they are worse off if they go for a suboptimal choice and in the less obvious sense that firms may be under less competitive pressure if they do so (Giulietti et al. 2005). It is a stylized fact, however, that in a number of liberalized services markets and countries where choice is possible, consumers do not switch service providers even though the tariffs they are holding are suboptimal (Jamasb and Pollitt 2005; OFT 2008; Sanco 2010; Lunn 2011); furthermore, when choices are made, there is a question mark about whether they are necessarily optimal (Joskow 2008; Wilson and Waddams Price 2010). Relevant services markets include both ones that have always been in the hands of the private sector, such as bank accounts, mobile telephony and internet services, and ones that have been opened up to competition in many countries, such as consumer electricity and gas services, telecommunications services and bank accounts (e.g., Ofcom 2009a; Xavier and Ypsilanti 2008; Lunn 2011; OFT 2008).
Undoubtedly, financial switching costs can act as a partial deterrent to changing services supplier in some cases. Identifying the role of different kinds of switching costs can be hard with field data, though important attempts have been made with survey data (Wilson and Waddams Price 2010). Moreover, very little switching, compared to the savings opportunities available, is observed even in markets, such as the UK retail electricity and gas markets, where financial switching costs are minimal. Attempts have been made to use survey data to infer non-financial reasons for not switching: the role of complexity in the tariffs employed and in the number of the tariffs employed has been claimed (e.g., Lambrecht and Skiera 2006; OFT 2008; Garrod et al. 2009; Sanco 2009; Lunn 2011) and has driven policy recommendations (e.g., Joskow 2008; Xavier and Ypsilanti 2008; Ofgem 2009b, 2011; Independent Commission on Banking 2011). For example, it has been put forward as a reason why the drive for liberalization of consumer energy markets has halted in the USA (Joskow 2008) and for envisaging requiring tariffs to be simpler in the UK (Ofgem 2011). Carlin (2009), Gabaix and Laibson (2006), Spiegler (2006) and Ellison and Ellison (2004) provide models explaining how complexity and confusion-inducing strategies may be desirable for firms.
The potential role of inattention in explaining suboptimal consumer outcomes has been mentioned, but is, in comparison, somewhat understated. 1 Yet, we suspect that, as with the inattentive agents of Sims (2003), real-life, time-constrained consumers may simply not pay attention to tasks regarding the choices of services. Put simply, such tasks may not be on their minds in the way in which saving 20 cents at a supermarket buying groceries is. 2 The key contribution of this paper is to build on this intuition.
Survey data are clearly useful as they directly refer to real life choices. That said, when it comes to understanding whether complexity or inattention genuinely explain non-switching behavior and hence potentially justify policy interventions, they suffer from a number of limitations, which leaves the role of both complexity and inattention unclear. 3 The limitations include the difficulty of drawing clear conclusions because of a range of alternative and undeclared factors, 4 forgetfulness and selectivity in recall, 5 the unconscious nature of many of the choices that people make, 6 and/or the need to self-justify past choices towards those conducting the survey or indeed to engage in self-deception to rationalize possibly suboptimal choices that one has made in the past. 7 A specific problem is that, if a significant part of the suboptimality of consumer behavior arises because consumers do not pay attention, drawing the attention of survey respondents to issues they have not thought of before may not be the best way to identify the extent to which inattention is a problem. In this case ex post rationalizations may be unavoidable and survey responses may underestimate the role of inattention.
Our paper addresses these issues by using an experimental methodology. Our first goal is to verify whether, in the absence of financial switching costs and using the stylized environment of the UK electricity and gas markets as a benchmark, we can identify a lack of switching and suboptimal outcomes when switching does take place. The second goal is to get a better understanding of why suboptimal outcomes take place. We test the role of complexity, which we decompose as complexity in the relationship between prices and quantity (linear vs. non-linear tariffs), in the presence or absence of bundling (single product vs. dual products tariffs), and in the number of tariffs. 8 We also test, more innovatively, the role of consumer inattention by suitably developing a methodology based on the presence of a default tariff and an alternative task, and we consider two possible alternative tasks across different treatments. This helps us to evaluate policies putting limits on the number and type of tariffs such as the regulatory constraints on complex tariffs recently proposed or just implemented by the UK regulator Ofgem (2011, 2014).

2 As supermarket shopping becomes increasingly an online shopping experience with default consumer baskets from previous purchases, supermarket shopping might arguably itself become more sensitive to inattention problems. There are a number of models of economic behavior incorporating inattention, such as Hong and Stein (2007), Hirshleifer and Teoh (2003), Gabaix (2011) and Woodford (2012); DellaVigna (2009) contains a review of some of the implications. The usual interpretation of inattention is in terms of lack of consideration of some features of a product. Inattention could however be in relation to a whole task.
The paper that is closest to ours is Friesen and Earl (2013), who look at the choice of complexity in the context of mobile phone plans, but their focus is on usage uncertainty and they do not look at the role of inattention (and so they do not have default tasks or an alternative task).
The institutional setup on which we primarily model our experiment is the UK electricity and gas markets. These are mature markets liberalized since 1996-1999 and comparatively simple in terms of the product they offer (energy). They are also comparatively transparent markets with a wide availability of online search and switching websites. 9 These websites enable both the identification of the best tariffs for any given level of consumption and easy switching of service provider at the click of a mouse. Tariffs can be either for electricity only, or for gas only, or they can be dual tariffs bundling together both electricity and gas; our experiment will focus on electricity only and dual tariffs. 10 The number of tariffs in the market is large: as an illustration, when we collected data for our experiment, we found as many as 72 electricity and 80 dual tariffs in the London, UK, energy market using an online website. 11 As for other services, consumers tend to stick to their status quo in terms of energy supplier (NERA 2003; Ofgem 2009a; Ofgem 2011; Behavioural Insights Team 2011): this acts as their default choice. For example, only 18 % of all respondents to an Ofgem consumer survey switched electricity supplier in 2009, and only 17 % switched gas supplier (Ofgem 2010). Furthermore, when switching takes place, the best (cheapest) tariff is often not chosen. Using data from 2005 and 2000 surveys, Wilson and Waddams Price (2010) estimated that only 8 to 20 % of consumers opted for the best tariff given their annual consumption levels.
Our key finding is that tariff complexity and the number of tariffs matter, but that inattention matters as well. Regulatory measures to reduce complexity are likely, therefore, to be of only partial value. Sections 2 and 3 describe the experimental design. Sections 4 and 5 present the descriptive statistics and the regression analysis, respectively. Section 6 provides a discussion and Sect. 7 concludes.

8 Kalayci and Potters (2011) have an interesting experiment where sellers choose product complexity, in terms of number of attributes of an abstract product, and find some evidence of consumer exploitability, though subject to consumers having to make decisions within 15 s; in an experiment again on product complexity (with products modeled as abstract lotteries) but no time constraints, Sitzia and Zizzo (2011) find some qualified (though only qualified) evidence of consumer exploitability. Unlike these experiments, we consider tariff complexity, number of tariffs and product bundling, and we employ tariffs mapped from real-world markets. Also, in treatments with time constraints subjects do have plenty of time to decide anyway (2 min), as verified against a control treatment without such time constraint.
9 Examples include http://www.which.co.uk/switch/, http://www.uswitch.com/, http://www.gocompare.com/gas-and-electricity/ and http://www.confused.com/gas-electricity.
10 A positive correlation between switching electricity and gas has been found (Giulietti et al. 2005).
11 Ofgem (2012) contains estimates, as do earlier Ofgem retail market reviews.

Experimental design: basic features
We ran the experiment at our university in 2011 and 2012. Before the beginning of the experiment, subjects read the instructions and completed a questionnaire with the purpose of checking they had understood the tasks. If they had any doubts they could ask for clarification. The experiment involved individual choices where subjects had repeated opportunities to choose among a set of tariffs. There were 36 rounds. In each round subjects were asked to choose one tariff from the set on offer (4 or 24 tariffs, depending on the task, as detailed below). Once they had chosen their preferred tariff they were asked to choose a consumption level for that tariff (five levels of consumption 12 were allowed: 1,000, 2,000, 3,000, 4,000 and 5,000 units). Once they had done that, the earnings for that tariff and level of consumption were displayed on the screen.
Revenues were exogenously given in a table for each level of consumption and increased with the level of consumption. Costs depended on the tariff chosen and the level of consumption, and also increased with the level of consumption. Earnings were then calculated by subtracting the costs from the revenues. The revenues for each level of consumption are presented in Table 1. They were such that for any tariff the optimal level of consumption was 4,000 units. 13 In all treatments subjects were assigned a default tariff. They could either stick to that tariff or have a look at the other tariffs and change it if they wanted to. At the end of the experiment, one of the 36 tasks was chosen randomly and subjects were paid according to the choice made in that task. 14 Average earnings were around 20 pounds. The experimental instructions and details on all tariff tasks are in an online appendix.
The tariffs. In February 2011, we collected all the electricity and gas tariffs available in the UK market to a London consumer using the "Which?" website. The tariffs ranged from simple ones with one tier (i.e. a single marginal price) to more complicated ones with two tiers and a ceiling (i.e. a marginal price and, once consumption exceeds a ceiling, a second and lower marginal price) or a standing charge and one tier (i.e. a fixed price plus a single marginal price). The tariffs in our experiment were partly real tariffs collected in this way and partly derived (i.e. created) by us using the same structure as the real ones (derived tariffs in what follows). The process of selection and derivation of all tariffs, as well as the full list, is described in detail in an online appendix. We employed 144 tariffs. Two-thirds of the tariffs were real and one-third were derived. Half of the real tariffs were for a single service (these are electricity tariffs) and the other half were dual tariffs (both gas and electricity). The derived tariffs were all dual ones. Subjects were only told that the tariffs related either to one good or to two goods (labeled as good A and good B). Table 1 shows a sample of tariffs used. The difference between the best tariff and the second best tariff was always at least 3 pounds.

12 Actual consumers of course do not have pre-defined possible levels of consumption. By having only five levels, we wished to keep things as simple as possible in this part of the experiment, bearing in mind, however, that actual consumers do have past consumption as a guide to future consumption, and so the level of consumption is not that much of an issue.
13 The average yearly household electricity consumption in the UK is around 4,000 kWh. The gas consumption is approximately four times this amount; in the experiment, we scaled this down by a factor of 4 for simplicity.
14 In all treatments, subjects could use a calculator on the computer screen to help them with their choices of tariffs and consumption levels. The calculator had four boxes for inputting consumption levels and the values of tier 1, tier 2, ceiling and standing charge of the tariff they wanted to check the cost of. Values are in experimental points. Earnings in experimental points were equal to the revenue associated with a given consumption level minus the tariff cost of that consumption level. The tariff cost was equal to Tier 1 × Consumption for simple tariffs; to Tier 1 × Consumption (for Consumption ≤ Ceiling) + Tier 2 × (Consumption − Ceiling) (for Consumption > Ceiling) for complex tariffs; and to Standing charge + Tier 1 × Consumption for tariffs with a standing charge.
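The cost and earnings rules described for the calculator can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names, the example prices and the revenue table are hypothetical (the actual revenues are those of Table 1), and for two-tier tariffs we assume that units up to the ceiling are charged at tier 1 and units above it at tier 2.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Tariff:
    tier1: float                     # marginal price per unit
    tier2: Optional[float] = None    # marginal price above the ceiling, if any
    ceiling: Optional[float] = None  # consumption threshold for tier 2, if any
    standing_charge: float = 0.0     # fixed charge, if any


def tariff_cost(t: Tariff, consumption: float) -> float:
    """Cost of a consumption level under a one-tier, two-tier or standing-charge tariff."""
    two_tier = t.tier2 is not None and t.ceiling is not None
    if two_tier and consumption > t.ceiling:
        # Assumed reading: units up to the ceiling at tier 1, the rest at tier 2.
        variable = t.tier1 * t.ceiling + t.tier2 * (consumption - t.ceiling)
    else:
        variable = t.tier1 * consumption
    return t.standing_charge + variable


def earnings(revenue: float, t: Tariff, consumption: float) -> float:
    """Earnings in experimental points: revenue minus tariff cost."""
    return revenue - tariff_cost(t, consumption)


# Hypothetical revenue table (NOT the values of Table 1), increasing in consumption.
REVENUE: Dict[int, float] = {1000: 300.0, 2000: 560.0, 3000: 780.0, 4000: 960.0, 5000: 1050.0}

# Hypothetical tariffs, one of each structure described in the text.
EXAMPLE_TARIFFS: Dict[str, Tariff] = {
    "simple": Tariff(tier1=0.12),
    "two_tier": Tariff(tier1=0.20, tier2=0.10, ceiling=1000),
    "standing": Tariff(tier1=0.11, standing_charge=50.0),
}


def best_choice(tariffs: Dict[str, Tariff],
                revenue: Dict[int, float] = REVENUE) -> Tuple[str, int, float]:
    """Return the (tariff name, consumption level, earnings) with the highest earnings."""
    options = (
        (name, q, earnings(rev, t, q))
        for name, t in tariffs.items()
        for q, rev in revenue.items()
    )
    return max(options, key=lambda option: option[2])
```

With these made-up numbers, every example tariff yields its highest earnings at 4,000 units, mirroring the property of the revenue table described in the text.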
The default tariff. The default tariff was always a derived tariff designed in such a way that it was never the earnings-maximizing tariff. The difference between the default tariff and the best tariff was usually at least around 6 pounds.
Nature of the tariffs employed in each task. The order of the 36 tasks was randomized.
• Number of tariffs. Half of the tasks employed 4 tariffs and the other half 24.
• Complexity of tariffs. 1/3 of the tasks only involved single real tariffs, 1/3 of the tasks only employed dual real tariffs and 1/3 of the tasks only employed dual derived tariffs.
• Mix of tariffs. 1/4 of the tasks only employed simple tariffs (both single and dual tariffs with only one tier); 1/4 of the tasks involved only complex tariffs (both single and dual ones with either two tiers or one tier and standing charge); the rest involved a mix of both simple and complex tariffs (half of each).
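The shares in the list above translate into task counts out of the 36 tasks as follows. This is a small arithmetic sketch: the dictionary keys and the function name are ours, and the text does not say how the three dimensions were crossed, so only the per-dimension counts are computed.

```python
from fractions import Fraction
from typing import Dict

N_TASKS = 36

# Shares of tasks along each design dimension, as stated in the text.
SHARES: Dict[str, Dict[str, Fraction]] = {
    "number_of_tariffs": {"4 tariffs": Fraction(1, 2), "24 tariffs": Fraction(1, 2)},
    "tariff_type": {
        "single real": Fraction(1, 3),
        "dual real": Fraction(1, 3),
        "dual derived": Fraction(1, 3),
    },
    "tariff_mix": {
        "simple only": Fraction(1, 4),
        "complex only": Fraction(1, 4),
        "mixed": Fraction(1, 2),
    },
}


def task_counts(n_tasks: int, shares: Dict[str, Dict[str, Fraction]]) -> Dict[str, Dict[str, int]]:
    """Convert per-dimension shares into whole numbers of tasks."""
    counts: Dict[str, Dict[str, int]] = {}
    for dimension, levels in shares.items():
        if sum(levels.values()) != 1:
            raise ValueError(f"shares for {dimension} do not sum to 1")
        counts[dimension] = {}
        for level, share in levels.items():
            n = share * n_tasks
            if n.denominator != 1:
                raise ValueError(f"{level}: {share} of {n_tasks} is not a whole number")
            counts[dimension][level] = int(n)
    return counts


COUNTS = task_counts(N_TASKS, SHARES)
# COUNTS["number_of_tariffs"] -> {"4 tariffs": 18, "24 tariffs": 18}
# COUNTS["tariff_type"]       -> {"single real": 12, "dual real": 12, "dual derived": 12}
# COUNTS["tariff_mix"]        -> {"simple only": 9, "complex only": 9, "mixed": 18}
```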

The experimental treatments
Treatment baseline (30 subjects). This treatment had a default tariff and a search engine. In each task subjects were shown the default tariff in a first screen; from this screen subjects could either stick to the default tariff or go to a second screen where they could see all the tariffs involved with the default tariff highlighted (see Figs. 1 and 2). For each task, if subjects did not make a choice within 2 min, they were assigned the default tariff.
Treatment web (alternative internet task, 30 subjects). This treatment involved two different tasks displayed on two different screens. On one computer screen subjects had the tariff task implemented as in the baseline treatment. On another screen they could browse the web but were not paid for that: only the tariff task was incentivized. If in any given period they did not make any active choice in the tariff task within 2 min, the default tariff was selected for them; they were then required to select their consumption level.
Treatment grid (alternative counting task, 50 subjects). In this treatment, subjects again had two screens in front of them. In one they could perform the tariff tasks. In the other, they could perform a counting task consisting in counting the 1s in 0-1 grids (see Fig. 3). This is a task deemed unpleasant enough in the real effort literature (as in Abeler et al. 2011) as to be considered a good measure of real effort. In our experiment, and as made clear in the instructions, it was also entirely unincentivized, which means that subjects should have ignored the task and focused entirely on the tariff tasks, on which their earnings depended exclusively. By comparing performance in the Web and the Grid treatment, we can verify whether the nature of the alternative task matters for our results.
Fig. 2 The Tariff Task Screenshot in the Baseline, Web and Grid Treatments
Treatment salient (alternative salient counting task, 50 subjects). This is the key treatment of our experiment. We employed the same counting task as in Grid, but the grid was now placed on the first screen of each task (see Fig. 4). On the same screen subjects also saw the default tariff and so, if they wished, they could choose this tariff in this screen and move straight to the consumption page. Alternatively they could opt to see all the tariffs involved in the task and select the tariff of their choice as usual.
Using the language of Zizzo (2010), our experimental manipulation deliberately employs a purely cognitive experimenter demand effect as an experimental tool to make subjects pay attention as a default to the counting task. 16 We would argue that, even with this purely cognitive experimenter demand, the tariff task is likely to be more salient in the experiment than going to a switching website can ever be for real world households. As a result, our inattention manipulation is likely to simply provide lower bounds on the kind of effects that inattention may produce in the real world. The comparison between performance in the Grid and the Salient treatments will be especially useful in isolating this effect as the alternative task is the same in the two treatments. As a result, a preference for the alternative task would not be able to explain any differential performance between the two treatments. 17
Fig. 3 Alternative Counting Task in the Web and Grid Treatments

Overview
We classify subjects' choices in three ways: subjects choose the best tariff, subjects stick to the default tariff, or subjects switch to a suboptimal tariff (i.e. a tariff that is not the best tariff). We define the default rate as the percentage of times subjects stick to the default tariff, and the suboptimal switching rate as the percentage of times subjects switch to a suboptimal tariff. The suboptimal outcome rate is therefore defined as the sum of the default rate and the suboptimal switching rate. If we define the optimal outcome rate as the percentage of times a subject chooses the best tariff, then the suboptimal outcome rate plus the optimal outcome rate is equal to 1 (i.e. 100 %). We will focus on the suboptimal outcome rate and its two components, default rates and suboptimal switching rates. Table 2 presents default rates, suboptimal switching rates and suboptimal choices for the four treatments.
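As an illustration of these definitions, the following sketch classifies each decision and computes the rates as shares between 0 and 1. The function names and the example data are hypothetical; the sketch relies on the design feature, stated earlier, that the default tariff is never the best tariff, so the three categories are mutually exclusive.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify(chosen: str, default: str, best: str) -> str:
    """Classify one decision as 'optimal', 'default' or 'suboptimal_switch'."""
    if chosen == best:
        return "optimal"
    if chosen == default:
        return "default"
    return "suboptimal_switch"


def outcome_rates(decisions: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Compute the rates from (chosen, default, best) tariff identifiers."""
    counts = Counter(classify(chosen, default, best) for chosen, default, best in decisions)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no decisions supplied")
    default_rate = counts["default"] / n
    suboptimal_switching_rate = counts["suboptimal_switch"] / n
    return {
        "default_rate": default_rate,
        "suboptimal_switching_rate": suboptimal_switching_rate,
        "suboptimal_outcome_rate": default_rate + suboptimal_switching_rate,
        "optimal_outcome_rate": counts["optimal"] / n,
    }


# Hypothetical example: four decisions with default tariff "D" and best tariff "B".
EXAMPLE = [("D", "D", "B"), ("B", "D", "B"), ("X", "D", "B"), ("D", "D", "B")]
RATES = outcome_rates(EXAMPLE)
# RATES["default_rate"] == 0.5, RATES["suboptimal_switching_rate"] == 0.25,
# RATES["suboptimal_outcome_rate"] == 0.75, RATES["optimal_outcome_rate"] == 0.25
```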

The role of inattention
Treatments Web, Grid and Salient verify the impact of inattention on consumer behavior, in particular when compared to the baseline treatment. As previously noted, inattention is difficult to study in an experimental setting because there is a natural bias that subjects have (in coming to the lab) to do something; this is different from households not paying attention to specific tasks, such as choices of services, because it is not part of their weekly, monthly or yearly routines. Table 2 and Fig. 5 present default rates, suboptimal switching rates and suboptimal choices for all the treatments that implement an alternative task. Let us start by noticing that default choice rates in the treatments that implement alternative tasks are all higher than default choice rates in our baseline treatment (Mann-Whitney p < 0.001 in all cases). 18 The difference is smaller for the Web and the Grid treatments than for the Salient treatment.
To remind the reader, the only difference between Web and Grid is the nature of the alternative task. In the Web treatment the alternative task is the internet task. In the Grid treatment the alternative task is a counting task. Both tasks are displayed on a different screen from the tariff task. Table 2 shows that in both treatments suboptimal outcome rates are about 50 %, roughly equally split between default choice rate and suboptimal switching rate. We observe no differences between the two treatments.

Remark 1
The default rate and the suboptimal switching rate, and consequently the suboptimal outcome rate, are not different between the Web and the Grid treatments. There is no support for the nature of the alternative task making a difference.
We do observe a significant increase in the default rates when we compare the Salient and Grid treatments. The only difference between them is the saliency of the alternative counting task. In the Salient treatment, the counting task is shown on the same screen as the tariff task. Table 2 shows that the default choice rate jumps up to 46 % in Salient (Mann-Whitney p < 0.03). The suboptimal switching rate is lower in Salient though the effect is marginal or insignificant (Mann-Whitney p = 0.09). Overall, 63 % of outcomes were suboptimal in Salient, against 50 % in Grid (Mann-Whitney p = 0.05). The different default rates between Grid and Salient can be interpreted in terms of inattention. Further support for this interpretation is provided in Sect. 5.
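For readers unfamiliar with the test used in these comparisons, a minimal pure-Python sketch of a two-sided Mann-Whitney U test follows. It uses midranks for ties and a normal approximation with tie correction and no continuity correction; this is one common variant, not necessarily the one used to produce the p-values reported here. The function names and the input data are hypothetical, and we assume, since the text does not say, that each observation is one subject's rate.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (U_x, U_y, two-sided p-value from the normal approximation)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u_x, u_y, 1.0  # all observations tied: no evidence of a difference
    z = (u_x - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_x, u_y, p_value


# Hypothetical subject-level default rates for two treatments (not the paper's data).
GRID_RATES = [0.10, 0.20, 0.30]
SALIENT_RATES = [0.40, 0.50, 0.60]
U_GRID, U_SALIENT, P_VALUE = mann_whitney_u(GRID_RATES, SALIENT_RATES)
```

With samples as small as in this toy example the normal approximation is crude; an exact test would be preferable there.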

Remark 2
The default choice rate is significantly higher in the Salient than in the Grid treatment, and three times as large as in Baseline. Overall suboptimal outcomes go up by 20 % in Salient relative to Baseline. Inattention matters.
Table 3 reports several other variables (other than average earnings) that can be used to gauge the extent of inattention in the Salient treatment. Specifically, the variable Search Engine shows the proportion of times subjects used the search engine. The share of times that subjects ended up with the default in the first screen (that is, the screen where only the default tariff is displayed, or the default tariff and counting task) is around 40 % in the Salient treatment, about 20 % in the Grid and Web treatments and much less (8 %) in the Baseline treatment. One sign of the effect of inattention to the tariff task is the proportion of subjects who ended up with the default tariff in the first screen because they did not make an active choice: this proportion was around 17.5 % in the Salient treatment and less than 5 % in the others. Another sign of the effect of inattention is the number of counting tasks that subjects did in the Salient treatment relative to the Grid treatment. They did more than three times as many counting tasks on average in the Salient treatment relative to the Grid treatment (164 in the Salient treatment versus 51 in the Grid treatment; Mann-Whitney p < 0.001). 19 In the Salient treatment the engagement with the task and default rates are strongly positively related, unlike in the Grid treatment (Spearman ρ = 0.56, p < 0.001, in Salient, vs. 0.17, p = 0.24 in Grid).
Table 2 shows averages of our three key variables for different dimensions of complexity. We will provide regression results later on to see the effects of complexity on our key variables. For now we merely notice that product bundling (dual markets vs. single markets) seems to have a small impact on the default rate and an even smaller effect on the suboptimal switching rate. The complexity of the tariffs employed seems to influence both the default tariff rate and switching rate, and, consequently, the suboptimal outcome rate. The number of tariffs has a small impact on the default rate and a more sizeable one on the suboptimal switching rate, and, consequently, on the suboptimal outcome rate.
Notes to Table 3. Calculator: the number of times on average that subjects used the calculator in the tariff task (Fig. 2); Search engine: the number of times on average that subjects used the search engine. The search engine could only be used in the tariff task. Default on the first screen: proportion of times subjects chose the default in the first screen (see Fig. 1); Engagement Counting Task: number of grids subjects engaged with. This variable is only available for Grid and Salient treatments; Earnings: this variable represents the average earnings per treatment. Given that in the dual markets the number of experimental points earned was double but the conversion rate was half that in the other tasks, we have normalized this variable to account for this.

Regressions
In this section we present three sets of regressions to shed light on what the role of inattention and complexity is in driving consumer outcomes, and which factors influence earnings, the use of search engine and calculator.

Regressions on whole sample
In this section we estimate several models on the default rate, the switching rate and the suboptimal outcome rate, plus the default rate occurring in the first screen. All the regressions have been estimated using Probit with robust standard errors to control for the fact that our observations are clustered at a subject level. The variables we use in the models are as follows:
• Period: this variable captures learning over time. Learning implies a negative coefficient.
• Period^2: this is period squared. If there is learning but only up to a certain point, we expect this variable to have a small positive coefficient.
• Dual market: this is a dummy variable that takes value 1 if markets are for two goods and 0 otherwise.
• Complex tariffs: this variable takes value 1 in tasks where all tariffs are complex and 0 otherwise.
• Mixed tariffs: this variable takes value 1 when the task involves both complex tariffs and simple tariffs, 0 otherwise.
• Number of tariffs: this variable takes value 1 when the task involves 24 tariffs, 0 otherwise.
• Calculator: this variable takes value 1 if the subject has used a calculator in the tariff task screen (i.e. where they see all tariffs), 0 otherwise.
• Search engine: this variable takes value 1 if subjects use the search engine, 0 otherwise.
• Baseline, Web and Grid: these variables are dummies for our control treatments, equal to 1 in the Baseline, Web and Grid treatments respectively, and 0 otherwise. Our benchmark treatment is the Salient one.
• Nationality, Gender and Age: these are demographic variables. Nationality is 1 if subjects are British, 0 otherwise. Gender is equal to 1 for men, else 0.
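For concreteness, a Probit specification with all the regressors listed above could be written as follows. This is a sketch in our own notation, not an equation reported in the paper: y_{it} is the binary outcome of subject i in period t (for example, sticking to the default tariff), \Phi is the standard normal distribution function, and \mathbf{z}_i collects the demographic variables. The individual models in the tables may include only a subset of these regressors.

```latex
\Pr(y_{it} = 1 \mid \mathbf{x}_{it})
  = \Phi\bigl(\beta_0
      + \beta_1\,\mathrm{Period}_t + \beta_2\,\mathrm{Period}_t^2
      + \beta_3\,\mathrm{Dual}_{it} + \beta_4\,\mathrm{Complex}_{it}
      + \beta_5\,\mathrm{Mixed}_{it} + \beta_6\,\mathrm{Number}_{it}
      + \beta_7\,\mathrm{Calculator}_{it} + \beta_8\,\mathrm{Search}_{it}
      + \gamma_1\,\mathrm{Baseline}_i + \gamma_2\,\mathrm{Web}_i + \gamma_3\,\mathrm{Grid}_i
      + \boldsymbol{\delta}'\mathbf{z}_i\bigr)
```

Under this notation, learning that peters out corresponds to \beta_1 < 0 and \beta_2 > 0, and a negative \gamma_1 means a lower probability of the outcome in Baseline than in the benchmark Salient treatment.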
Regressions on the default rate. Table 4 shows that, independently of the model estimated, the default rate decreases over time but the decrease peters out with time, as shown by the combined Period and Period^2 coefficients. There is also strong evidence in most of the models we estimate that subjects stick more to the default tariff when tariffs are complex. The use of the calculator and search engine is negatively correlated with the default rate. There is also strong evidence that subjects choose the default tariff less frequently in the Baseline treatment than in the Salient one; the Baseline has in fact the lowest estimated coefficient, by around 28-32 % depending on the model specification. In the other two treatments the effect is milder but still significant in most model specifications.
Regressions on the suboptimal switching rate. Independently of the specification of the model, we find that the suboptimal switching rate is greater when tariffs are all complex or mixed and when the number of tariffs is 24. A greater use of the calculator is positively correlated with the suboptimal switching rate (probably because, if subjects are confused, they use the calculator more, to get an idea of which one is the best tariff). The search engine, as expected, is negatively correlated with the suboptimal switching rate. The Baseline treatment has a significantly higher suboptimal switching rate than the Salient treatment.
Regressions on the suboptimal outcome rate. The suboptimal outcome rate is the sum of the default rate and the suboptimal switching rate and so it is not surprising that the complexity effects from the number and mix of tariffs, and the small one from dual tariffs, are replicated in terms of a higher suboptimal outcome rate. The search engine is negatively correlated with the suboptimal outcome rate and so is the use of the calculator, as the negative coefficient on the default rate more than offsets the positive coefficient on the suboptimal switching rate. The coefficients on Baseline are all statistically significant and always negative. Relative to the Baseline, the effect is around 18 % and between 15 and 19 % depending on the model specification. The effects are smaller and always negative for Web and Grid but still statistically significant in three models out of four for the Web treatment and two out of four for the Grid. 21

Remark 3
The regression analysis confirms that, while the suboptimal switching rate goes down in Salient relative to the Baseline and changes only by little relative to Grid and Web, the inattention effect produced by the Salient treatment manipulation leads to a significantly higher default rate and this, in turn, leads to a significantly higher suboptimal outcome rate. The effect size on the suboptimal outcome rate is around 20 % relative to the Baseline.
Regression on the default rate in first screen. The dependent variable in this regression is restricted to the subjects that choose the default tariff in the first screen without even having a look at all the tariffs in the subsequent screen. 22 The treatment variables coefficient estimates are similar to the corresponding coefficients in the Default Tariff regressions. This shows that the differences among treatments are mainly driven by the decisions of those subjects that stick to the default tariff in the first screen and that this takes place significantly more in the Salient treatment than in the others.

Regressions on sub-sample
In order to further test the interpretation that the differences among treatments are mainly driven by subjects who end up with the default tariff in the first screen, we now estimate regressions with the sub-sample of subjects that did not end up with the default tariff in the first screen. 23 If this interpretation is correct, we would expect the coefficients on the treatment variable to be small relative to those observed on the default rates and suboptimal outcome rates in the Table 4 regressions. Table 5 reports the results of regressions on our key variables using the sub-sample. In line with this interpretation, in comparing the Table 5 with the Table 4 default rate regressions, the coefficients on Baseline, Web and Grid have dropped considerably; for example, the Baseline coefficient has dropped from around 28-32 % to just 5 %, and the coefficients are even smaller in the Web treatment and particularly the Grid treatment, where the difference with the Salient is no longer significant. 24 This leads to smaller, and never statistically significant, coefficients in the suboptimal outcome rate regressions. Conversely, the effects of complexity and of learning appear roughly the same when comparing the full sample of Table 4 with the sub-sample of Table 5.

20 We find mild evidence that nationality has a positive effect on suboptimal switching rates.
21 There is some evidence to suggest that men obtain fewer suboptimal outcomes than women, though this is not entirely robust to the model specification. Such an effect would follow from the fact that men stick less to the default tariff than women.
22 The Calculator and Search Engine variables are omitted from this regression because, by definition, they are equal to 0 if the subject simply ends up with the default tariff on the first screen.
23 Again, we employ Probit regressions with robust standard errors to control for the fact that our observations are clustered at a subject level.
Result 4 The differences in default rates and suboptimal outcome rates between Salient and the other treatments are mainly driven by subjects ending up with the default tariff in the first screen; when these subjects are removed, the differences become small, and disappear completely in the suboptimal outcome rates regressions.
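The sub-sample logic behind this result can be illustrated with a small sketch. The counts below are invented for illustration and are not the experimental data; the function names and the record layout are likewise hypothetical. The sketch compares the gap in default rates between two treatments on the full sample and on the sub-sample that excludes first-screen defaults.

```python
# Illustrative sketch only: all counts are invented, not taken from the experiment.

def make_records(treatment, n_first_screen_default, n_later_default, n_switched):
    """Build hypothetical records: (treatment, defaulted_on_first_screen, ended_with_default)."""
    return (
        [(treatment, True, True)] * n_first_screen_default
        + [(treatment, False, True)] * n_later_default
        + [(treatment, False, False)] * n_switched
    )

def default_rate(records, treatment, exclude_first_screen=False):
    """Share of choices ending with the default tariff in one treatment."""
    rows = [r for r in records if r[0] == treatment]
    if exclude_first_screen:
        # Keep only choices where the subject went on to the tariff screen.
        rows = [r for r in rows if not r[1]]
    return sum(1 for r in rows if r[2]) / len(rows)

# Hypothetical data: 100 choices per treatment.
records = make_records("Salient", 40, 5, 55) + make_records("Baseline", 10, 5, 85)

gap_full = default_rate(records, "Salient") - default_rate(records, "Baseline")
gap_sub = (default_rate(records, "Salient", exclude_first_screen=True)
           - default_rate(records, "Baseline", exclude_first_screen=True))

print(f"gap on full sample: {gap_full:.3f}")
print(f"gap on sub-sample:  {gap_sub:.3f}")
```

In this invented example the treatment gap shrinks sharply once first-screen defaults are excluded, which is the pattern the sub-sample regressions are designed to detect. The point is about the gap between treatments, not the overall level of the default rate, which falls mechanically in any such sub-sample.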
Result 4 supports the interpretation that inattention to the tariff task is responsible for the higher default rates and higher suboptimal outcome rates in Salient relative to the other treatments. Such inattention is brought about by the saliency of the alternative task. Table 6 presents Probit regressions, with standard errors clustered at the subject level, on what affects the use of the calculator and search engine, together with regressions on what affects earnings. The regression on earnings is estimated by OLS with robust standard errors, also clustered at the subject level.
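To show why clustering at the subject level matters when each subject contributes several observations, the following sketch works through the simplest possible case. It is not the Probit or OLS specification used in the paper: it swaps in an intercept-only linear model (a sample mean) and compares a heteroskedasticity-robust standard error with a cluster-robust one, without small-sample corrections. The data and the function name are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def mean_with_se(values, clusters):
    """Return (mean, naive_se, cluster_se) for an intercept-only linear model.

    naive_se treats every observation as independent (HC0 formula).
    cluster_se sums residuals within each cluster before squaring (CR0 formula),
    so correlated choices by the same subject are not counted as independent.
    """
    n = len(values)
    mean = sum(values) / n
    resid = [v - mean for v in values]

    naive_var = sum(e * e for e in resid) / n**2

    resid_sum_by_cluster = defaultdict(float)
    for e, g in zip(resid, clusters):
        resid_sum_by_cluster[g] += e
    cluster_var = sum(s * s for s in resid_sum_by_cluster.values()) / n**2

    return mean, sqrt(naive_var), sqrt(cluster_var)

# Hypothetical data: 4 subjects, 3 rounds each; 1 = stuck to the default tariff.
# Each subject behaves the same way in every round, so rounds are highly correlated.
choices = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
subject = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]

mean, naive_se, cluster_se = mean_with_se(choices, subject)
print(f"mean={mean:.3f} naive_se={naive_se:.3f} cluster_se={cluster_se:.3f}")
```

With perfectly repeated behaviour within subjects, the clustered standard error is larger than the naive one, because twelve observations carry little more information than four independent subjects.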

Calculator, search engine and earnings
Use of calculator. The dependent variable in this regression is as defined previously, i.e. the use of the calculator in the tariff task screen shot where subjects see all tariffs in that round. According to Table 6, subjects use the calculator less over time, and slightly less in markets for two tariffs and when the number of tariffs is 24. Subjects use the calculator almost every period in all treatments (on average above 82 % for all treatments), which may help explain why there is little variation in the use of the calculator across treatments.
Use of search engine. The dependent variable in this regression is equal to 1 if subjects have used the search engine and 0 otherwise. Table 3 reports the average use of the search engine in all three treatments. Subjects can only use the search engine in the tariff task screen shot, so these regressions are only run on the sub-sample where subjects have not chosen the default tariff in the first screen. Table 6 shows that subjects tend to use the search engine more as time goes by, but this effect decreases over time. The use of the calculator is a substitute tool for the use of the search engine. There is also mild evidence that subjects use the search engine more when tariffs are complex or mixed. The treatment coefficients are positive, and mildly significant in the case of the Grid treatment, suggesting a greater use of the search engine in this treatment than in Salient. 25
Earnings. Regressions on earnings complement the analysis of suboptimal outcome rates in the previous sections. 26 There is evidence of learning over time but this peters out with time. Plausibly, using the calculator and the search engine is correlated with higher earnings. The complexity effect from the number of tariffs carries through and reduces earnings, with milder evidence for the complexity of the tariffs themselves.
24 One might wonder whether this is a trivial implication of selecting the sample of subjects who have not chosen the default in the first screen. However, what these coefficients are picking up is not that the overall default rate is lower (which is trivially true) but rather that the incidence of default rates is very different between the Salient and the other treatments when the full sample is used, and different only by about 1/4 between the Salient and the other treatments when the cases where the default tariff was selected in the first screen are excluded.
When combining all dimensions of complexity, that is, comparing the case with 24 dual and complex tariffs with that with 4 single and simple tariffs, the regression model implies that complexity made a difference to earnings of around 2,800 points. 27 The inattention effect captured by the treatment effects was considerably larger than this, with coefficients of between around 7,300 and 8,400 more points earned in treatments other than the Salient treatment. This points to the importance of dealing with the inattention problem rather than just focusing on the complexity problem.
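The relative size of the two effects can be checked with simple arithmetic on the figures quoted above. The sketch below uses only the rounded values reported in the text (around 2,800, 7,300 and 8,400 points), so the ratios are approximate; the variable names are illustrative.

```python
# Rounded point estimates quoted in the text (approximate values).
complexity_effect = 2800        # combined complexity effect on earnings, in points
inattention_low = 7300          # smallest treatment coefficient relative to Salient
inattention_high = 8400         # largest treatment coefficient relative to Salient

# Complexity effect as a share of the inattention effect.
share_vs_low = complexity_effect / inattention_low    # about 0.38
share_vs_high = complexity_effect / inattention_high  # about 0.33

print(f"complexity / inattention: {share_vs_high:.2f} to {share_vs_low:.2f}")
```

On these rounded figures, the combined complexity effect is roughly one third to two fifths of the inattention effect.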

Are outcomes suboptimal?
A significant fraction of consumers makes suboptimal choices, either because of sticking to a default or because of switching to a suboptimal choice. In the Baseline treatment, where there is only one activity available, even with just 4 tariffs about 1/3 of the choices are suboptimal, rising to over a half when there are 24 tariffs (Table 2); subjects stick to the default in only around 15 % of cases, which does not seem to fit with real world stylized facts regarding the percentage of consumers not switching (e.g., DECC 2012). One key reason for the difference is that real world consumers may simply not pay attention to saving money by switching energy supplier: the routine activities of their everyday life are more prominent. There is no point in time in the day, the week, the month or even the year where, as a routine, consumers are required to pay attention to the task of choosing an energy supplier, as there is always a default energy supplier; there is no equivalent of, say, the weekly major supermarket shopping trip that a household may do every Saturday morning in order not to run out of food. Conversely, subjects come to the laboratory with an expectation that they need to pay attention and engage in a task, and it is no surprise that, given the availability of a search engine, they use it to get to much better outcomes, as we would expect with real world consumers as well. The question then becomes why, in the real world, consumers do not use search engines in an equally effective way. Our intuition is that, because consumers do not pay attention, they often do not get to the stage where they are faced with a search engine: the problem may be made simple, but this is not enough if it is simply not in their minds.
To test this intuition, we added either a non-salient (Web, Grid treatments) or a salient (Salient treatment) alternative task for subjects to engage in. The salience of the alternative task was used as a tool to induce inattention, albeit much less than what can be expected in the real world. 28 In our Salient treatment, as many as around 45 % of choices stuck to the default, and even with just 4 tariffs over half of the choices are suboptimal. Overall suboptimal outcomes go up by 20 % in Salient relative to Baseline (Result 2), and our regression analysis confirms the existence of a quantitatively large effect (Result 3), one that is replicated if one uses earnings as the dependent variable (Sect. 5.3). We interpret this as an inattention effect, and this is supported by the fact that the effect is mainly driven by subjects ending up with the default tariff on the first screen (Result 4), the one with just the alternative task in the Salient treatment. It is also supported by the fact that subjects were more engaged with the alternative, counting task in the Salient treatment (Sect. 4.2).
We found a small complexity effect involving product bundling, and larger ones related to whether the tariff is linear or non-linear, and whether there are 4 or 24 tariffs. That said, the effect of product bundling was small enough that it did not affect earnings to a statistically significant degree, and the overall impact of our complexity manipulations on earnings was less than half that found for our treatment manipulations identifying inattention effects. In essence, complexity does matter; however, economists and policy makers should pay more attention to the role of inattention 29 for tasks that do not fit in the usual household consumption routines. 30 By unpacking the psychological determinants of switching costs, however, the experimental methodology does allow us to provide clear-cut messages on the potential effectiveness of policy measures that either do or do not tackle them. Furthermore, the direction for consumer welfare improvements is clear, unlike in survey studies where, as per the Coombs and Shaharudin (2012) critique, it is not necessarily obvious whether consumers may be getting a good deal after all.
Our experimental evidence suggests that even restrictive regulatory measures forcing tariffs to be linear and only four in number (with the potentially distorting effects on competition that such restrictions may have) would still only help partially. Even more so, Ofgem's (2014) measures in the UK to limit the number of tariffs provided by each firm to 4 per fuel, meter and payment type will be of only partial help, as the number of tariffs in the market as a whole is likely to remain above our experimental upper number of 24 tariffs; and their regulation of the complexity of tariffs will still mean that there are non-linear tariffs, so we shall still be in the world of ranges of complex tariffs or of mixed complexity tariffs. There are, of course, good (technical and competitive) reasons why Ofgem cannot simplify the market further. More fundamentally, Ofgem's (2014) measures will have limited impact as they do not tackle the inattention problem, and this may be at least as significant as the complexity problem.

Conclusions
We found that, in markets for services and even in the presence of a search engine, consumers are likely to stick to defaults and achieve suboptimal outcomes. The experiment aimed to unpack two key psychological reasons why they do this: complexity (in terms of non-linearity, number and bundling of tariffs) and consumer inattention. By employing an experimental methodology, we are in a position not only to identify the causal role of different psychological dimensions, but also to test the effectiveness of policies designed to improve consumer outcomes. Our experiment, and our tariffs, are inspired by stylized features of UK electricity and gas markets, but the lessons we draw are likely to be more general, as both the underlying features (such as non-linear tariffs and the presence of defaults) and the psychological mechanisms apply more widely.
Task complexity matters to some degree. However, when there is a default tariff and the tariff choice is not salient to subjects relative to alternative tasks, suboptimal outcomes will be achieved because of consumer inattention. Based on our findings, Ofgem's (2014) measures to try to obtain better consumer outcomes in the UK energy retail market are likely to be of only limited impact.