1 Introduction

The way multinational ICT companies are taxed is a matter of debate that often lands in the headlines. While governments have experienced since the 1990s the positive impact of Web-based technologies on labour productivity and on product variety, they also recognize that the very nature of some of the underlying business models, particularly their intangible nature, often entails an erosion of revenues from indirect and corporate income taxation. Advocates for new ways to tax these companies argue that a tax on sales, a mechanism that reallocates taxing rights, or a mixture of the two, might help restore a “fair” level of taxation as compared to the taxation of non-ICT companies. As some governments have in recent years proposed or introduced ad hoc taxes on the turnover these companies generate locally, it is imperative to gain a complete understanding of the economic consequences of these policies in order to inform policymakers and the general public of their likely effects and social costs. This paper contributes in particular to the analysis of the effects of an ad valorem tax on the sales of advertising by Web companies.

On 21 March 2018 the European Commission launched an initiative (Footnote 1) aimed at obtaining a fairer allocation of taxing rights in the digital market. The proposal came after several cases in which European and national tax authorities forced some very large Web companies to pay taxes for liabilities supposedly due from past years. The debate on Web companies and their impact on tax revenues is not exclusive to Europe, though: see for instance the empirical works of (among others) Goolsbee (2000), Alm et al. (2005), Ballard et al. (2007) and Einav et al. (2014), which document the erosion of sales tax revenues at the U.S. State level due to E-commerce. This kind of base erosion may lead to non-trivial equilibria where tax rates are adjusted, too (Agrawal & Wildasin, 2020). The European Commission’s proposal envisaged a two-step approach: first a “targeted solution” introducing a tax on the sales of digital products and services (also named Web Tax in the media), then a more comprehensive approach based on a revised allocation of these multinationals’ profits across the Union and on new rules reflecting digital presence according to the nexus principle. The present paper focuses on the Web Tax alone.

The announced aim of the European Commission’s Web Tax is mainly twofold: to recover lost revenues from corporate income tax (in this respect, the Web Tax would act as a substitute for the corporate tax) and also, in the European Commission’s words, “to level the playing field” by reducing the tax-induced advantage of New Economy firms vis-a-vis traditional “brick-and-mortar” firms (Footnote 2). One reason why an indirect tax would be introduced as a substitute for direct taxation of profits can be traced back to the limitations that corporate tax systems face when dealing with intangible goods and assets, which facilitate transfer pricing and allow companies to sell to residents of a country or region without any physical presence there (the latter has long been a prerequisite for the application of source taxation; see Footnote 3).

Some authors (e.g. Auerbach et al. (2008) and, specifically for digital markets, Agrawal and Fox (2017)) have endorsed the application of a destination-based principle to corporate taxation as a comprehensive solution to profit shifting which would apply to digital and non-digital markets alike. Although such a proposal has many merits, in this author’s view a problem with it is that both the concepts of source and destination are hard to apply when dealing with advertising-supported digital services. Leading companies like Alphabet/Google and Meta/Facebook generate most of their revenues from selling advertising while providing their digital services for free. They are able to provide Web services in one country while selling advertising space in another country where the advertiser is resident. In such cases it is not straightforward to determine, let alone to measure, where the tax-wise destination of a transaction is located. The nexus principle as defined by OECD (2013), which advocates the use of the location of input factors as a proxy for the location of value generation, is not fully applicable either, as production, the location of data, company servers and most of the company workforce might be in entirely different countries unrelated to the place where consumption occurs (thus, application of the nexus principle would de facto apply a source-based principle). Hence the idea to try and capture corporate profits indirectly, by means of a Web Tax levied on an imputed value of sales allocated to each European Member State where a Web multinational operates and provides services.

Apart from the technical problems faced in applying such a Web Tax, its Welfare implications hinge on correctly gauging the incidence effects of the reform. I argue here that the nature of Web businesses providing digital services for free and selling ads to produce revenues, coupled with the very peculiar way in which these digital services match consumers with advertised content, may lead to very special conditions that make the standard theory of tax incidence under oligopolistic competition inapplicable. More specifically, the use of sophisticated matching algorithms based on consumers’ profiling implies that the larger the base of users is for a Web company, the more efficient the matching and, consequently, the larger the value-per-view (or per-click) for advertisers. As the total number of users served affects the willingness to pay of advertisers, it stands to reason that the resulting inverse demand function is not necessarily monotonically decreasing in quantity. This intuition serves as the starting point for the theoretical analysis that follows.

This paper contributes to the literature by extending standard Cournot-Nash models of tax incidence to a special case where firms operate on two-sided markets. In the model, although Web companies are assumed symmetrical and compete with equal market power with respect to advertising services, they can enjoy monopolistic power in their own consumers’ market. Firms acquire, or generate, “contacts” by supplying a Web service for free to consumers. They produce revenues by selling some of these contacts to advertisers in the form of ad banners, and advertising on a Web platform is more valuable for advertisers the larger its pool of contacts is. The model is then used to better understand the likely effects of a Web Tax on equilibrium price, ads sold and generated contacts. The policy contribution is twofold: to provide a theoretical framework which can be used to understand the likely Welfare costs, or gains, of the reform before factoring in tax revenues; and also to suggest where future empirical analysis should focus in order to quantify such costs.

A major result I find is that, in the context of the model, adjustments to a new ad valorem tax on the side of acquired users of a Web service and on the side of sold ads move in opposite directions: Web companies may increase the quality of their service in order to gain more users, thus increasing the value ads have for advertisers and therefore increasing the gross ads price, which partly compensates for the tax; but, at the same time, Web companies may reduce the amount of sold ads. Depending on the specific assumptions chosen for the model, the opposite may also hold true: Web companies might react to the tax by reducing investment and users, lowering the gross ads price while increasing the amount of sold ads. The overall Welfare effects of such changes are not trivial and call for more targeted empirical research to narrow down the analysis to specific demand functions and elasticities.

The paper is structured as follows. Section 2 provides motivation, in an informal way, for the idea that quantities produced and sold can differ for advertiser-supported Web services. It also provides support for the claim that larger users’ bases positively affect the reservation price of advertisers. Section 3 summarizes the related literature and contrasts it with the modelling choices taken in the next sections. Section 4 introduces a model of symmetric Cournot competition. Section 5 derives policy-relevant results for an ad valorem tax. Section 6 finally draws the main conclusions and points to further avenues for future research.

2 Motivation

Digital advertising has been growing steadily in the last two decades, while traditional marketing channels have not. Digital ads bring some distinct advantages to advertisers compared to broadcast ads. While the idea of targeting based on indirect proxies for consumers’ types is not exclusive to the Web and has been used extensively in printed, radio and TV media, Web services enable a much deeper matching between prospective consumers and ads. Ads can be “personalized” and sent to users with observable characteristics that predict higher chances to click on the ad, to purchase the advertised product or to be influenced in the intended way. The association between observables and the consumer’s behaviour is based on a large number of data points that may include: what the user does before, during and after having been exposed to an ad; what his or her preferences are with regard to content, interests, locations and several other areas; what the associations are between observing an item (for example, a search keyword) and subsequent behaviour. The current development of machine learning algorithms promises an even deeper level of matching in the near future.

Google explains its advertising services, AdWords and AdSense, as follows (Footnote 4): “With millions of websites, news pages, blogs, and Google websites like Gmail and YouTube, the Google Display Network reaches 90% of Internet users worldwide. With specialized options for targeting, keywords, demographics, and remarketing, you can encourage customers to notice your brand, consider your offerings, and take action.” It further explains that “Google automatically delivers ads that are targeted to your content or audience” by using “contextual targeting”, “placement targeting”, “personalized advertising” (which is described as follows: “Personalized advertising enables advertisers to reach users based on their interests, demographics (e.g., “sports enthusiasts”) and other criteria”) and “language targeting”. Similarly, Facebook (Footnote 5) explains its advertising facilities as follows: “Two billion people use Facebook every month. With our powerful audience selection tools, you can target the people who are right for your business. Using what you know about your customers, such as demographics, interests and behaviours, you can connect with people similar to them.” It then details how its platform allows advertisers to “Find people based on what they’re into, such as hobbies, favourite entertainment and more” and “based on their purchasing behaviours, device usage and other activities.” Another example is provided by Reddit (Footnote 6), which explains: “With over 250 million users, it can be difficult to know how best to reach your audience. Interest targeting gives you the ability to pinpoint your audience [...]. With interest targeting, you can display your ad to the right audience based off a user’s browsing behavior on Reddit! [...] Targeting an interest group means you are targeting users who have expressed interest in a specific type of content. For example, a user who engages in a post relating to sports will be shown sports ads for a period of time after engaged with that type of content. As a user engages in different content their interest categorization will dynamically change, ensuring all ads are relevant to that user.”

From these examples, and from many more that are easily found on the Internet, the common characteristics of these technologies become clear. First, users are constantly analysed with respect to their observable behaviours. Second, these behaviours are codified and stored. Third, the stored data are used by automated algorithms to match users with ads (based on keywords or criteria provided by the advertiser, or possibly through fully automated matching). The number of users on a given Web service plays a key role, as having larger numbers enhances prospects for the algorithm to find a good match for a given ad. This is particularly true for advertised products that cater to a niche demand and therefore benefit the most from having their ad seen or clicked by a good match. It is worth highlighting that all three of these notable examples (Google, Facebook and Reddit) stress, as the very first piece of information provided, the very large number of users they can potentially reach.

The consequences of these observations for economic theory are, first, that the quantity produced by a Web company can be different from the quantity sold. That is, the number of potential visualizations per user, times the number of users of a Web service, can be (and likely will be) larger than the amount of ad space sold to advertisers. The reason for this discrepancy is found not only in the desire to avoid congestion of the service (too much advertising could make it less appealing for users) but, most importantly, in the fact that as profiling and matching algorithms improve thanks to technological advances, having more users improves the value of each ad for advertisers and therefore entails a larger willingness to pay. As a Web company increases produced quantities (which is to say, it increases its user base) but does not increase the amount of sold ads, the price of its ads may increase even though the quantity of ads sold is unchanged. On the contrary, as sold ads increase without a company changing the amount of service provided to users, the oligopoly market price will go down, as with the usual decreasing inverse demand functions. When these effects are at play, standard tax incidence theory does not transfer well because inverse demand functions are no longer necessarily monotonically decreasing. These observations call for a specific modelling of ads-supported Web services to understand the likely effects of indirect taxation on such markets.

3 Related literature

Broadly speaking, this paper is related to the literature on indirect tax incidence, e.g. Weyl et al. (2013). More specifically, a relatively recent literature addresses the effects of indirect taxation on digital companies in two-sided markets, see for instance: Kind et al. (2008, 2010, 2013), Kind and Koethenbuerger (2018).

In Kind et al. (2010) in particular, under the assumptions that consumers pay a positive per-unit price to buy newspapers and are served by a monopolistic platform which also collects revenues by selling advertising space in its newspapers, it was found that an ad valorem tax on revenues from sales increases ads sales and reduces ads price. This model is extended to a Hotelling duopoly where competition happens on prices and on the degree of product differentiation, finding similar results. In Kind and Koethenbuerger (2018), a monopolistic digital platform provides a good and advertising space, both for a price, and the authors find that the effects of a tax depend upon whether users like or dislike advertising. In particular, with ad-averse users and advertisers getting more value from ads if the user base is larger, the tax may increase output on both sides of the market (advertising and final users), while the own-tax elasticity of ads sales is always negative. The latter paper produces, like this model, the result that ads price may be reduced by a tax, though (contrary to this model) such a result requires that consumers are averse to advertising.

A related paper is Bourreau et al. (2018), where a monopolistic digital platform provides online services to users for a fixed access fee and zero unit price. The monopolist in this model sets prices to maximize profits and exploits users’ personal data to provide them with personalized services and to sell targeted advertising to online sellers. Users value the possibility to buy from well-matched sellers and receive negative utility from uploading more personal information. The authors find that for a free-for-use platform, an ad valorem tax generally increases prices and reduces sold ads and user-provided data. This result hinges on the assumption that users care about the sellers’ behaviour, while in this model I do not impose any assumption about consumers’ evaluation of potentially useful offers.

The model presented hereafter departs from these cited works in several important ways. First, I assume that Web companies compete against each other in a Cournot-Nash advertising market, but at the same time they are leaders, or monopolists, in their own Web service market (the latter assumption is supported by ample evidence, see for instance Haucap and Heimeshoff (2014)). This assumption in my view better represents observed conditions in digital industries, where giant companies like Alphabet/Google and Meta/Facebook compete over the same advertisers but each enjoys strong market power in its own area (respectively, search engines for Google and social networks for Facebook). This assumption should therefore improve the external validity of the model vis-a-vis models assuming a monopolistic regime. The latter is the most commonly assumed regime in the works previously referenced, with the sole exception of Kind et al. (2010), which discusses a Hotelling competition regime. Second, I assume consumers do not pay any price to access Web services, but they face private variable costs due to the opportunity cost of the time needed to use these services. This assumption also better serves external validity, as the top global ads-supported Web services (e.g. Google, YouTube, Facebook, Reddit, Yahoo!, Twitter) all provide their services at no charge. The latter assumption, combined with the idea that the usefulness of ads improves with the number of users, bears an important consequence which is better explained in the following section: Web companies might produce more Web contacts than the number of contacts sold to advertisers.

There are other relevant differences between the present model and the previous literature. I assume consumers to be neutral w.r.t. advertising, meaning I provide an analysis that disregards users’ tastes about ads intensity. The latter assumption, although a departure from reality, is meant to provide an analysis that is free from confounding elements related to users’ preferences, which are still debated in the literature. Moreover, I do not assume that Web services are platforms providing direct sales facilities (contrary for example to Bourreau et al. (2018), where the better matching for ads is valued by consumers as it leads to higher chances to make a valuable purchase), because I am interested in a situation where advertising may be related to any content, thus in principle applying to brand awareness campaigns and political advertising, too. The obtained results are therefore more general and apply to a broader range of ads-supported Web content. I am then able to show that the result according to which ads sales might increase and ads prices decrease after introducing an ad valorem tax on advertising revenues is not necessarily linked to consumers’ preferences about ads or potential purchases. In the model the link between the ads and users markets is due to the investment behaviour of the Web companies and their ability to separately affect sold ads and the number of users through changes in the service quality. Finally, also contrary to Bourreau et al. (2018), I do not allow users to choose how much personal data to disclose; rather, I assume that the provision of personal data happens passively as a by-product of using Web services (e.g. a user employs a search engine, and in doing so reveals to the supplier of the service behavioural patterns through clicks and searched keywords). I believe the latter to be closer to actual data patterns found in major services like Google, Reddit and YouTube, where active voluntary data disclosure by users is minimal.

On the empirical side there is still a scarcity of studies documenting the effects of taxation on ads prices. A paper that is somewhat related is Lassmann et al. (2020), which studies the effect of corporate taxation on Facebook ads prices. The authors find that ads prices increase with tax rates with a significant pass-through of taxes (in the preferred analysis scenario, overshifting of taxes on ad prices is found to generate a profit tax pass-through rate between 1.23 and 2.68). The intensity of the pass-through they find is similar to what other studies find for VAT in general. To the extent that results obtained for the corporate tax can be thought of as a proxy for similar behaviours related to ad valorem taxes, they would therefore predict a rise in ads prices and a reduction in the amount of sold ads.

4 The model

This section illustrates the model, first by presenting the three types of agents operating in this stylized economy (Subsect. 4.1), then discussing aggregation and equilibrium conditions (Subsect. 4.2). Subsection 4.2 will also present one of the major results stemming from the analysis, which points at the possibility for Web companies to undersell their produced Web contacts.

4.1 Agents of the model

There are three types of agents in the model: consumers (I also use the term users, interchangeably), advertisers, and Web companies (also named firms; see Footnote 7). Figure 1 graphically illustrates the interactions between the model’s three types of agents.

Fig. 1 An illustration of the different agents in the model

Consumers There are I consumers. Each consumer decides how much of each Web service to consume. Each unit of consumption of a Web service should be thought of in the context of the specific Web service. For instance, one unit of consumption might represent making a query on a search engine and reading the first returned page containing ten results. Or, on a social network service, a single unit of consumption might represent reading through a fixed number of posts (which may include a mixture of text, images and multimedia content).

Consumers are characterized by a quasilinear well-behaved utility function \(u_i({\bar{k}}, {\bar{z}}, y) = \sum _j \theta _{ij}(k_{ij}, z_j) + y\) and by a budget constraint \(\sum _j h k_{ij} + y = M\), where M is exogenous income, y is a Hicksian-composite good with price equal to 1, \(k_{ij}\) defines the units of consumption of a specific Web service j of quality \(z_j\) by consumer i (each unit \(k_{ij}\) is assumed normalized in order to be equivalent to a single advertising “contact”) and h is the price of each unit \(k_{ij}\). I assume that \(\frac{\partial {\theta }}{\partial {k}} >0\), \(\frac{\partial {\theta }}{\partial {z}} >0\), \(\frac{\partial ^2{\theta }}{\partial {k}^2} <0\), \(\frac{\partial ^2{\theta }}{\partial {z}^2} <0\), \(\frac{\partial ^2{\theta }}{\partial {k}\partial {z}} >0\). In this setting, h is the opportunity cost of time spent online in order to consume one unit of the Web service (thus, h does not represent a price paid to Web companies), therefore the budget constraint is the potential income M that an individual might spend on the Hicksian-composite good while consuming zero units of the Web services. Note that advertising does not show up in the utility function, which equates to assuming that changes in the intensity of ads shown per unit of consumed Web service do not affect utility.
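
As a sketch of the individual choice implied by these assumptions, one can substitute the budget constraint into the utility function and differentiate. An interior solution with \(k_{ij} > 0\) is assumed here purely for illustration, and the shorthand \(k^*_{ij}\) for the optimal consumption level is introduced only for this purpose:

$$\begin{aligned} \max _{\{k_{ij}\}_j} \;&\sum _j \theta _{ij}(k_{ij}, z_j) + M - h \sum _j k_{ij} \\ \Rightarrow \quad&\frac{\partial \theta _{ij}(k^*_{ij}, z_j)}{\partial k_{ij}} = h, \\&\frac{\partial k^*_{ij}}{\partial z_j} = - \frac{\partial ^2 \theta _{ij} / \partial k_{ij} \partial z_j}{\partial ^2 \theta _{ij} / \partial k_{ij}^2}> 0, \qquad \frac{\partial k^*_{ij}}{\partial h} = \frac{1}{\partial ^2 \theta _{ij} / \partial k_{ij}^2} < 0. \end{aligned}$$

The two comparative statics follow from implicit differentiation of the first-order condition together with the sign assumptions stated above: a higher quality \(z_j\) raises individual consumption of service j, while a higher opportunity cost of time h lowers it.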

Function \(\theta _{ij}(k_{ij}, z_{j})\) is assumed consumer-specific, meaning that the same amount of consumption of a given Web service j of quality \(z_j\) generates different levels of utility in different consumers. I assume the existence of a very large crowd of (potential) consumers and that functions \(\theta _{ij}(.)\) are uniformly distributed across such population; consequently, each symmetric Web service will face the same average \(\theta _j(.)\). Again for simplicity and tractability, I assume consumers are homogeneous w.r.t. h and M. The other source of heterogeneity across consumers is an entry cost (to be interpreted as the cost, in terms of time and effort, required to learn how to use the new Web service), which is modelled as a utility level \(\theta _{E}\), such that a consumer will opt for the Web service if, and only if, \(\theta _{E} < \theta (k^*, z)\), where \(\theta (k^*, z)\) stands for the utility obtained at the optimum, given the observed values for \({\bar{k}}, {\bar{z}}, y\). I also assume that the distributions of functions \(\theta _{ij}\) and of \(\theta _{E}\) are uncorrelated across the population of potential consumers. Taken together, these assumptions imply that each consumer will consume positive quantities of a given Web service only if the expected utility obtained from such consumption surpasses the reservation level \(\theta _{E}\).

Web companies There are J Web companies, each operating in a distinct market j. Web companies obtain their revenues by selling advertising space at a unit price \(p_j\). Web companies act as price takers, thus competing in a Cournot fashion. At the same time, though, they can affect price directly by investing in the quality of their Web service. By attracting a large number of users to their Web service, Web companies generate “contacts”, which are then sold for a price to advertisers. A contact should be interpreted as some standardized measure of usage of a Web service, which matches the corresponding unit of measure used to quantify consumption \(k_{ij}\) by an individual consumer. For example, a contact can be the visualization of a single Web page, or of a fixed number of posts, items of a list, or multimedia objects. A contact, in the context of this model, is therefore a concept that comes close to the definition of a visualization as used to define Web ads’ Cost-per-Mille (CPM). A contact produces, as a by-product of the consumer’s experience of the Web service, the possibility to show the user a single ad.

Each Web company provides its Web service free of charge and is able to improve the appeal of its Web service, thus affecting quality z, at a cost. An increase in z can be interpreted as improvements made to the Web service, such as a better interface, more appealing content, larger capacity or faster responsiveness. Similarly, a reduction in z represents a deterioration of the service, for example longer queuing or downloading times due to more stringent bandwidth limitations, lower quality of media content, etc. Note that z captures at the same time the underlying quality of the infrastructure and of contents that are not generated by users. For instance, rating platforms like TripAdvisor or Yelp may want to increase quality z by hiring personnel to check more often, or more thoroughly, whether reviews are authentic, thus making their Web service more useful for users. Another example is user-side matching algorithms: Facebook for instance might invest more in R&D in order to improve the matching of prospective “friends” and contents on its platform, thus increasing the value for the users. All such increases of the quality variable \(z_j\) will consequently bring about an increase both in contacts \(q_j\) (see below) and in total investment. This will motivate the introduction, in the following, of a cost function expressed as \(c(q_j)\). Given the observed demand for ads and the vector of qualities in the other product markets, a firm decides its own quality level \(z_j\), which uniquely determines the amount \(q_j\) of produced contacts. Equivalently, a firm maximizes profit by choosing a level of produced contacts \(q_j\); thus from now on the notation will concentrate on this variable to economize space, though one needs to always keep in mind that such amount of produced contacts stems from the choice of how much to invest in quality.

Net profit for firm j is:

$$\begin{aligned} \pi _{j}(q_j, s_j) =p(q_j, s_j + D_{-j}) s_j -c(q_j) \end{aligned}$$
(1)

where \(s_j\) represents the amount of ad space sold to Advertisers. The model postulates a mapping of Web contacts \(q_j\) to sold ads \(s_j\), such that sold ads can never exceed \(q_j\). The following example clarifies how units of a Web service consumed by a consumer can be expressed in units of an advertising contact. Consider a search engine (like Google) that places a fixed number of sponsored results on the first page of a search. If, for example, such number is three, then one unit k of Web service consumed by a single Consumer, which in this example is one page of search results from the search engine, would correspond to three ad banners. Thus, by assuming a fixed ads-to-service ratio, the model allows consumed units of a Web service to be treated as contacts, and each contact as a potential Web ad. The way individual consumption \(k_{ij}\) maps to aggregate acquired contacts \(q_j\) is illustrated next in Subsect. 4.2.

Ad prices stem from competition against other Web companies, hence they are represented by an inverse demand function p(.) which is a function of the total amount of sold ad contacts (from the \(j^{th}\) Web company and also from the other \(J-1\) Web companies), and of the amount of contacts acquired by Web company j. Here \(D_{-j}\) denotes the aggregate ads demand net of the demand \(s_j\) served by firm j, so that \(s_j + D_{-j}\) represents total ads demand in the market. Function p(.) is also positively affected by the level of \(q_j\) because, as explained above, a Web company can invest more in quality at a cost \(c(q_j)\), attract more contacts and thus make the ads on its platform more valuable for the Advertisers.
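
The two opposing forces acting on the ads price can be made explicit with a total differential of the inverse demand appearing in Eq. (1). This is only an illustrative sketch: differentiability of p(.) is assumed, and the shorthand \(D = s_j + D_{-j}\), together with the subscript notation \(p_q\) and \(p_D\) for the partial derivatives, is introduced here for convenience and is not part of the original notation.

$$\begin{aligned} dp = p_q \, dq_j + p_D \, dD, \qquad p_q> 0, \quad p_D < 0. \end{aligned}$$

Holding contacts fixed (\(dq_j = 0\)), selling more ads lowers the price, as along a standard inverse demand curve; holding sold ads fixed (\(dD = 0\)), acquiring more contacts raises it. When both margins move together the sign of dp is ambiguous, which is the sense in which the price need not be monotonically decreasing in quantity.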

Advertisers There are also N Advertisers. Advertisers are a large number of businesses who choose the amount of advertising contacts \(a_j\) to buy from Web company j, at unit price \(p_j\), in order to maximize their profit function

$$\begin{aligned} \pi _{advertiser} = \sum _J g(a_j, q_j) - \sum _J p_j a_j \end{aligned}$$
(2)

where \(g(a_j, q_j)\) is an increasing concave function (with \(\frac{\partial {g}}{\partial {a}} >0\), \(\frac{\partial {g}}{\partial {q}} >0\), \(\frac{\partial ^2{g}}{\partial {a}^2} <0\), \(\frac{\partial ^2{g}}{\partial {q}^2} <0\), \(\frac{\partial ^2{g}}{\partial {a}\partial {q}} >0\)) representing the value added obtained from each Web contact on service j. The value advertisers get from each contact, as already explained, is also an increasing function of the total contacts \(q_j\) that are potentially reachable through the Web company they chose to buy from.
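
A minimal sketch of how Eq. (2) pins down the advertisers’ willingness to pay, assuming an interior optimum and assuming that each advertiser treats \(p_j\) and \(q_j\) as given:

$$\begin{aligned}&\frac{\partial \pi _{advertiser}}{\partial a_j} = \frac{\partial g(a_j, q_j)}{\partial a_j} - p_j = 0 \quad \Rightarrow \quad p_j = \frac{\partial g(a_j, q_j)}{\partial a_j}, \\&\frac{\partial p_j}{\partial a_j} = \frac{\partial ^2 g}{\partial a^2} < 0, \qquad \frac{\partial p_j}{\partial q_j} = \frac{\partial ^2 g}{\partial a \partial q}> 0. \end{aligned}$$

Under the sign assumptions on g, the price an advertiser is willing to pay per contact therefore falls with the number of ads bought and rises with the pool of contacts reachable through the Web company.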

Note that the model does not differentiate between ads, as each and any ad in the model has the same value for advertisers. This means that showing one ad once to a new consumer has the same value to an advertiser as showing the same ad one additional time to the same consumer. This way of treating ads considerably simplifies the analysis and is coherent with a traditional “Cost-per-Mille (CPM)” ads pricing strategy, where only the number of shown impressions counts. It does not allow for pricing ads as unique impressions, where only the first time the ad is displayed to a particular user matters.

4.2 Aggregation and equilibrium solution

Given the aforementioned assumptions, by standard economic reasoning individual consumers’ demands can be aggregated, for each Web service j, into an aggregate demand function \(K_j({\bar{z}}, h, M)\) which is increasing in \(z_j\) and depends also on the quality of the other Web services (the vector \({\bar{z}}\) represents all values \(z_j\) for each j in J). Note that demand \(K_j\) is measured in terms of “contacts”, which are obtained as the product of an extensive margin (the number of users willing to consume Web service j) times an intensive margin (the amount of the Web service the latter demand to consume). Thus, the previous assumption according to which Advertisers’ profit increases with the number of contacts available from their Web service of choice stems from the fact that contacts also increase via the extensive margin (as explained in Sect. 2: it is the number of different users, more than the intensity of their usage of the Web service, which may improve ads matching).
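
The product structure of aggregate contacts can be written out as follows. This is a sketch under assumptions that are not in the original text: \(n_j\) is a hypothetical symbol for the number of users whose entry condition is satisfied for service j, \(\tilde{k}_j\) is a hypothetical symbol for their average consumption, and both are assumed differentiable in \(z_j\).

$$\begin{aligned} K_j = n_j \, \tilde{k}_j, \qquad \frac{\partial K_j}{\partial z_j} = \underbrace{\frac{\partial n_j}{\partial z_j} \, \tilde{k}_j}_{\text {extensive margin}} + \underbrace{n_j \, \frac{\partial \tilde{k}_j}{\partial z_j}}_{\text {intensive margin}}. \end{aligned}$$

The first term captures new users attracted by higher quality, which is the channel that matters most for ads matching according to Sect. 2; the second captures more intense usage by existing users.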

Consistent with the assumption of symmetry held throughout the analysis of a Cournot-Nash oligopolistic market, I assume that each firm faces an identical demand. Therefore, at the equilibrium, the values for chosen quality \(z_j\) and acquired contacts \(q_j\) will be the same across the J firms and there will be a single price p for all ads, regardless of the service they come from (the j index from \(K_j(.)\) can therefore be dropped, thus writing it as \(K(z, h, M)\), with z having the same value for all J products). This amounts to assuming that all firms face the same competitive conditions in each of the consumption markets they serve. This assumption is meant to represent an economy where a few large Web companies enjoy substantial monopolistic power, each in its own served market (e.g. Google in the search engine market, Facebook in the social networks market), and also face similar conditions with respect to the competition regime for non-leader companies. Each company is further assumed to serve a distinct market, so I rule out the possibility of Web companies competing both on ads and on the same served Web market. By the same token, in the following I will simply write the demand for advertising as a single aggregate demand function \(D(p, q)\), decreasing in p and increasing in q. Thus, each Web company faces a different Web service demand \(K_j(.)\) (though, as stated, symmetry implies that all \(K_j\) values end up being equal), while they all compete for the same advertising demand \(D(.)\). Aggregate demand for ads \(D(p, q)\) can be interpreted as a classical demand function, for each given (fixed) number of produced contacts q.

Because for each level of service supply z, and thus of acquired contacts \(q = K(.)\), the level of advertising demand D is uniquely determined by the equilibrium price, one can write \(p(q, D)\) as the inverse demand for advertising given the symmetric quantity of acquired contacts q, and c(q) as the cost function for Web companies to produce a quality level z for their Web service. Such a quality level z is able to attract \(q=K(.)\) contacts. Note that from now on, in order to make the text more readable, the quantities q, s and z will also be written as such (without the subscript j) to denote respectively the same amount of acquired contacts, sold ads and investment in quality by each of the J symmetric Web companies. Figure 2 summarizes the concepts just illustrated in diagrammatic form, by splitting the decision tree of Web companies into two ideal phases: a first phase in which the company decides how much to invest in quality in order to attract consumers to the Web service and produce contacts, and a second phase in which it decides how much contact-equivalent Web space to sell to advertisers.

Fig. 2 A diagrammatic representation for Web companies in the model

The model allows the number of contacts sold to advertisers to be lower than the number of produced contacts q, so that \(s<q\) is possible. Proposition 1 below demonstrates why it can be efficient for a firm to sell a number of ads that is smaller than the number of produced contacts.

Referring again to the Web company’s profit function (1), I assume that:

$$\begin{aligned}&\frac{\partial {p}}{\partial {q}} > 0 \end{aligned}$$
(3)
$$\begin{aligned}&\frac{\partial {p}}{\partial {s}} < 0 \end{aligned}$$
(4)
$$\begin{aligned}&\frac{\partial {c}}{\partial {q}} > 0 \end{aligned}$$
(5)
$$\begin{aligned}&s \le q, s \ge 0, q \ge 0 \end{aligned}$$
(6)

The first two conditions express that price increases with q (because of raised willingness to pay by advertisers) and decreases with s (because, by standard arguments and keeping all other variables constant, the equilibrium price decreases with sold quantities). Costs are assumed increasing with quantity q, while the last inequality, \(s \le q\), sets the constraint that the number of contacts sold to advertisers can never be larger than the number of contacts produced by the firm.

First-order conditions (FOCs) to maximize (1) are:

$$\begin{aligned}&\frac{\partial {p}}{\partial {q}} s = \frac{\partial {c}}{\partial {q}} \end{aligned}$$
(7)
$$\begin{aligned}&\frac{\partial {p}}{\partial {s}} s + p(q, s_j + D_{-j}) = 0 \end{aligned}$$
(8)

Second-order conditions (SOCs) are:

$$\begin{aligned}&\frac{\partial ^2{p}}{\partial {q}^2} s - \frac{\partial ^2{c}}{\partial {q}^2} < 0 \end{aligned}$$
(9)
$$\begin{aligned}&2 \frac{\partial {p}}{\partial {s}} + \frac{\partial ^2{p}}{\partial {s}^2} s < 0 \end{aligned}$$
(10)
$$\begin{aligned}&\frac{\partial ^2{p}}{\partial {s}\partial {q}} s + \frac{\partial {p}}{\partial {q}} < 0 \end{aligned}$$
(11)

Note that these SOCs, together with assumptions under (3)–(5), imply that in order to allow for a solution with a positive amount of ads sold, it must hold true that \(\frac{\partial ^2{p}}{\partial {s}\partial {q}} < 0\). Moreover, the ratio \(\frac{\frac{\partial ^2{c}}{\partial {q}^2}}{\frac{\partial ^2{p}}{\partial {q}^2}}\) needs to be positive, which means that with convex costs one would also need the ads price to be convex in q. Alternatively, concave or linear costs (\(\frac{\partial ^2{c}}{\partial {q}^2} \le 0\)) would require that \(\frac{\partial ^2{p}}{\partial {q}^2} < 0\). The SOC in (9) also requires that \(|\frac{\partial ^2{c}}{\partial {q}^2}| \gg |\frac{\partial ^2{p}}{\partial {q}^2}|\) in order to allow for reasonably large values for \(s^*\).

Substituting (8) into (7) and rearranging, the FOCs can be rewritten as:

$$\begin{aligned}&p(q, s_j + D_{-j}) = \left| \frac{\partial {p}}{\partial {s}} \frac{\frac{\partial {c}}{\partial {q}}}{\frac{\partial {p}}{\partial {q}}} \right| \end{aligned}$$
(12)
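The step from the FOCs to (12) can be spelled out as a short sketch. It only uses (7), (8) and the sign assumptions (3)–(5); the argument of p is suppressed for brevity.

$$\begin{aligned} \frac{\partial {p}}{\partial {q}} s = \frac{\partial {c}}{\partial {q}} \;&\Rightarrow \; s = \frac{\frac{\partial {c}}{\partial {q}}}{\frac{\partial {p}}{\partial {q}}} \\ \frac{\partial {p}}{\partial {s}} s + p = 0 \;&\Rightarrow \; p = -\frac{\partial {p}}{\partial {s}} s = -\frac{\partial {p}}{\partial {s}} \frac{\frac{\partial {c}}{\partial {q}}}{\frac{\partial {p}}{\partial {q}}} \end{aligned}$$

Because \(\frac{\partial {p}}{\partial {s}} < 0\) while \(\frac{\partial {c}}{\partial {q}} > 0\) and \(\frac{\partial {p}}{\partial {q}} > 0\), the right-hand side is positive and equals the absolute value appearing in (12).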

Equation 12 highlights the role of having more contacts, which enhance the value obtained from advertisers. As the number of Web companies rises, the contacts produced by each will decrease and so will the value of ads for advertisers; the intensity of this effect is captured by \(\frac{\partial {p}}{\partial {q}}\) in Eq. 12. Thus, an increase in the number of competing firms reduces equilibrium price not only via standard arguments from Cournot-Nash literature, but also because of the specific mechanism due to matching algorithms discussed above.

To conclude the definition of a Cournot-Nash equilibrium, the following equalities must also hold. Equation (13) states that the sum of all sold ads must equal aggregate ads demand. Equation (14) exploits Eq. (12) and imposes that all ads in the J symmetric markets are sold for the same price, as determined by the inverse demand function \(p(q^*, D)\).

$$\begin{aligned}&D = \sum _J s_j^* \end{aligned}$$
(13)
$$\begin{aligned}&\left| \frac{\partial {p}}{\partial {s}} \frac{\frac{\partial {c}}{\partial {q}}}{\frac{\partial {p}}{\partial {q}}} \right| = p(q^*, D) \end{aligned}$$
(14)

The following Proposition demonstrates that an equilibrium can feature Web companies underselling available contacts to advertisers.

Proposition 1

A solution to the firm’s profit maximization problem can feature underselling of produced contacts to advertisers. In symbols, this amounts to stating that a solution to the profit maximization problem can feature \(s^* < q^*\).

Proof

See the "Appendix". \(\square\)

The ability of Web companies to undersell available contacts to advertisers can be better understood by considering how a Web page is occupied by ads. The intensity of ads showing up on a Web page can vary in terms of frequency (e.g. after how many minutes of video playback a new ad is shown to the user), space (e.g. how many pixels on screen are occupied by ad banners) or time (e.g. how long a pop-up ad remains visible). As the present model assumes that such an intensity in ads does not enter consumers’ utility directly, Web companies in the model may safely increase such intensity without affecting the number of acquired contacts. As for how realistic such a set of assumptions is, I believe they approximate real-world markets fairly well as long as one also assumes that ads intensity never reaches such levels as to make consumption of the underlying Web service excessively cumbersome or virtually impossible.

A condition for having a solution as in Proposition 1 is \(s^* > \frac{p}{|\frac{\partial {p}}{\partial {s}}|}\). Given that s captures the number of contacts sold, a number that can very likely be in the ballpark of millions, the condition \(s^* > \frac{p}{|\frac{\partial {p}}{\partial {s}}|}\) is not unlikely to hold in practice even if \(\frac{\partial {p}}{\partial {s}}\) is, in itself, a very small value. Also note that, as the number of firms J decreases, each firm will command a larger share of the ads market and thus it will be more likely that the condition \(s^* > \frac{p}{|\frac{\partial {p}}{\partial {s}}|}\) holds true.

From previous assumptions, firms can decide to produce more than they sell, and the exceeding production still affects the demand for advertising space. This special mechanism stems from the fact that one additional produced contact has two distinct effects: it increases total costs, and it also increases the price p of each sold contact to advertisers. Sold contacts s on the other hand also bear two effects: to increase revenues, and to reduce the equilibrium price of ads. As these effects can have different intensities, contacts produced and sold coincide only under very specific parametrizations of the model such that the marginal cost (net of the marginal contribution to revenues) from one produced contact is exactly equal to the marginal revenues from one sold contact.

5 Welfare and tax incidence

Total welfare in the economy described by the model in Sect. 4 is obtained, at the symmetric equilibrium, by summing up all profit for Web companies and for advertisers together with consumers’ utility:

$$\begin{aligned} W = V(q^*) + (IM-h J q^*) + G(s^*, q^*) - J c(q^*) \end{aligned}$$
(15)

where \(V(q^*) = \sum _I \sum _J \theta _{ij}(k_{ij}, z^*)\) expresses total sub-utility for all consumers at the (symmetric) level of quality \(z^*\) associated with equilibrium quantity of contacts \(q^*\), while \(G(s^*, q^*) = \sum _N \sum _J g(a_j, q^*)\) is aggregate gross profit generated by all advertisers, given the equilibrium values for s and q.

Noting that function g(.) is by assumption increasing in both of its arguments and that s does not enter any of the other addenda in (15), the following Proposition is immediately verified.

Proposition 2

Starting from any equilibrium solution with underselling of produced contacts to advertisers (\(s^* < q^*\)), an exogenous increase of sold ads s is Welfare-improving.

Proposition 2 implies that underselling of ads is, in the context of the model, always detrimental to Welfare. The intuition is straightforward: selling additional ads from already acquired contacts does not imply any additional production cost for Web companies, and because of the assumed neutrality w.r.t. ads, consumers’ utility is unaffected as well. However, advertisers would benefit from more advertising space at a lower unit price. Thus from a policy-maker’s perspective this market acts like a standard Cournot market in the sense that the higher concentration of Web companies constrains the supply of ads suboptimally, compared to an ideal social first-best supply.
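The argument behind Proposition 2 can be sketched in one line. The sketch treats s as a continuous variable and holds \(q^*\) fixed, as in the statement of the Proposition:

$$\begin{aligned} \frac{\partial {W}}{\partial {s}} = \frac{\partial {G(s, q^*)}}{\partial {s}} > 0 \end{aligned}$$

The equality holds because \(V(q^*)\), \((IM-h J q^*)\) and \(J c(q^*)\) in (15) do not depend on s; the inequality follows from g(.) being increasing in the quantity of ads.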

However, trying to increase the number of competing firms to improve Welfare, as in a market featuring standard Cournot competition, may be an ill-conceived policy. First, from Eq. (15) it is clear that raising J exogenously does not necessarily cause Welfare improvements, as the outcome of such a change would depend on a number of assumptions regarding the functional forms and numerical parameters used to estimate such effect. Second, Web companies in this economy each offer a distinct Web service and serve a demand that is quick to shift from one competing service to another offering even slightly better features (hence the often cited motto for the Web: “competition is at the distance of a click”). Moreover, even though I did not include such a characteristic in the model, consumers too might value the number of fellow users of a platform, for instance on social networks and messaging applications, which would further question the Welfare effects of splitting the user base of such services.

A tax or subsidy on the price of ads might be employed to increase Welfare. On top of the practical issue of having a dependable estimate for the elasticity of price in order to set the rate right, the following Proposition 3 adds one more challenge.

Proposition 3

Starting from an equilibrium without taxes and with underselling of produced contacts to advertisers (\(s^* < q^*\)), the introduction of an ad valorem tax \(\tau\) on ads sales affects the equilibrium sold ads (\(s^*\)) and produced contacts (\(q^*\)) with opposite signs.

Proof

See the "Appendix". \(\square\)

The intuition behind Proposition 3 is that the tax leads to one of two outcomes. Either, at the new equilibrium, there will be a higher ads price, fewer ads sold but larger investment in quality in order to attract more contacts and thus support the higher ads price; or, alternatively, there will be a lower ads price, a larger quantity of sold ads, but less investment in quality, resulting in a smaller number of acquired contacts.

From Proposition 3 one sees that increases in s are accompanied by a reduction of q, which enters the Welfare Eq. (15) by reducing both the value of ads for advertisers and the quality of Web services for consumers. Thus, Welfare improvements due to a tax of this kind are not guaranteed and depend upon the specific parametrization.

5.1 Isoelastic inverse demand function

In order to further advance the policy analysis, in this subsection I turn to a specific functional form for the inverse demand function.

Proposition 1 holds regardless of the specific functional forms chosen for p(.) and for the cost function, provided that they fulfil the requirements from the initial assumptions in (3)–(6) and the additional requirements stemming from the SOCs in (9)–(11). I here choose the following functional form:

$$\begin{aligned} p(.) = \frac{(1+q_j)^{\alpha }}{(s_j+D_{-j})^{\beta }} \end{aligned}$$
(16)

with \(\alpha \ge 0\) and \(\beta > 0\). This function is convenient for multiple reasons. First, it complies with previous assumptions (\(\frac{\partial {p}}{\partial {q}} \ge 0\) and \(\frac{\partial {p}}{\partial {s}} < 0\)), and it is such that \(\frac{\partial ^2{p}}{\partial {s}\partial {q}} < 0\), \(\frac{\partial ^2{p}}{\partial {s}^2} > 0\) and \(\frac{\partial ^2{p}}{\partial {q}^2} < 0\) if \(\alpha <1\). Thus, it also complies with SOCs and guarantees an interior solution. Second, if \(\alpha = 0\) it reduces to the well known constant elasticity demand function used in many previous works dealing with indirect taxation under Cournot-Nash competition. Thus it can be considered as a generalization of the constant elasticity function and therefore it allows for better comparability between this model and models in the literature. Third, it is such that \(\lim _{D \rightarrow \infty } p(.) = 0\) for any \(\alpha \ge 0\), which is also convenient (in terms of its external validity) as it means that any departure from the traditional constant elasticity function will still imply a negative relation between price and demand. When \(\alpha \ge 0\) the curve shifts upward with increases in \(q_j\), while the elasticity of price w.r.t. demand remains constant across the function domain. Finally, isoelastic demand functions bring some benefits in terms of making the model empirically testable (though such testing goes beyond the scope of this paper) as econometric methods to estimate price elasticities are well known and readily available.
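For reference, the partial derivatives of (16) behind these sign claims can be written out as follows. The symbol \(A \equiv s_j + D_{-j}\) is used here only as shorthand and is not part of the model's notation.

$$\begin{aligned} \frac{\partial {p}}{\partial {q}}&= \frac{\alpha (1+q_j)^{\alpha -1}}{A^{\beta }} \ge 0 \\ \frac{\partial {p}}{\partial {s}}&= -\frac{\beta (1+q_j)^{\alpha }}{A^{\beta +1}} < 0 \\ \frac{\partial ^2{p}}{\partial {s}\partial {q}}&= -\frac{\alpha \beta (1+q_j)^{\alpha -1}}{A^{\beta +1}} \le 0 \\ \frac{\partial ^2{p}}{\partial {s}^2}&= \frac{\beta (\beta +1)(1+q_j)^{\alpha }}{A^{\beta +2}} > 0 \\ \frac{\partial ^2{p}}{\partial {q}^2}&= \frac{\alpha (\alpha -1)(1+q_j)^{\alpha -2}}{A^{\beta }} \end{aligned}$$

The cross derivative is strictly negative whenever \(\alpha > 0\), and the last expression is negative for \(0< \alpha < 1\). The elasticity of price with respect to total sold ads is \(\frac{\partial \ln p}{\partial \ln A} = -\beta\), which is constant over the whole domain.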

The following Proposition 4 derives the conditions for an interior solution. Assuming demand as in (16) I obtain that a market equilibrium exists only if the elasticity of price w.r.t. sold ads is above one. Conditions related to the value for the \(\alpha\) elasticity are less clear cut and also depend on the specific cost function.

Proposition 4

If the inverse demand function is in the form as in (16), an equilibrium solution exists iff:

  1. \(\beta > 1\)

  2. Either \(0 \le \alpha < \frac{(1-\tau )s_{j}}{\frac{\partial {c}}{\partial {q}}(s_{j}+D_{-j})^{\beta }}\) or \(1< \alpha < \frac{\frac{\partial {c}}{\partial {q}}(s_{j}+D_{-j})^{\beta }}{(1-\tau )s_{j}}\)

Proof

See the "Appendix". \(\square\)

Note that the FOCs suggest something about the conditions leading to an equilibrium with either \(s < q\), or \(s = q\). Indeed, setting the marginal profit w.r.t. s equal to zero yields \(s^{*} = \frac{D_{-j}}{\beta -1}\), which means that the optimal amount of sold contacts \(s^{*}\) is decreasing in \(\beta\) (again, assuming \(\beta > 1\) to limit the analysis to feasible market equilibria). Because \(s^{*}\) by definition has an upper bound in \(q^{*}\) (as Web companies cannot sell more contacts than they have), one may state that an equilibrium where \(s^* = q^*\) is more likely when \(\beta\) is close to 1 (while remaining above it). The case \(\alpha =0\) produces an equilibrium where \(s=q\), because a Web company has no incentive to raise q above what is strictly needed to meet the optimal amount of sold ads \(s^*\) (as in this case a larger q increases total cost but does not raise advertisers’ willingness to pay).
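The closed form \(s^{*} = \frac{D_{-j}}{\beta -1}\) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes that the taxed ads revenue takes the form \((1-\tau ) p s\), takes rivals' sales \(D_{-j}\) as given, and uses purely hypothetical parameter values. The function names are illustrative.

```python
# Illustrative check of the best response s* = D_{-j} / (beta - 1) under (16).
# All parameter values below are hypothetical, chosen only for the example.


def price(q, s, d_other, alpha, beta):
    """Inverse demand (16): p = (1 + q)^alpha / (s + D_{-j})^beta."""
    return (1.0 + q) ** alpha / (s + d_other) ** beta


def net_revenue(s, q, d_other, alpha, beta, tau):
    """Assumed after-tax ads revenue (1 - tau) * p * s.

    The cost c(q) is omitted because it does not depend on s.
    """
    return (1.0 - tau) * price(q, s, d_other, alpha, beta) * s


def best_response_sold_ads(q, d_other, alpha, beta, tau, steps=200_000):
    """Grid search for the revenue-maximizing s on (0, q], respecting s <= q."""
    grid = (q * i / steps for i in range(1, steps + 1))
    return max(grid, key=lambda s: net_revenue(s, q, d_other, alpha, beta, tau))


alpha, beta, tau = 0.5, 2.0, 0.1  # hypothetical elasticities and tax rate
q, d_other = 100.0, 10.0          # hypothetical produced contacts and rivals' ads

s_numeric = best_response_sold_ads(q, d_other, alpha, beta, tau)
s_closed = d_other / (beta - 1.0)

print(f"grid search: s* = {s_numeric:.3f}, closed form: s* = {s_closed:.3f}")
```

With these values the closed form gives \(s^{*} = 10\), well below \(q = 100\), so the constraint \(s \le q\) does not bind and the firm undersells its contacts.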

The following Proposition determines the sign of the change in quantities sold and produced.

Proposition 5

If the equilibrium features underselling of produced contacts to advertisers (\(s^* < q^*\)) and the inverse demand function is in the form as in (16), an increase in an ad valorem tax \(\tau\) always increases the equilibrium price p.

Proof

See the "Appendix". \(\square\)

Taken together, Propositions 4 and 5 imply that a rise in the tax produces a decrease in the quantities sold to advertisers and an increase in the number of served consumers. Thus, assuming the functional form in (16) and a large enough elasticity (\(\beta > 1\)) leads to predictions similar to those of standard Cournot-Nash oligopoly papers, which predict a reduction in quantities sold and an increase in equilibrium price in response to an ad valorem tax. Also, this result is similar to Bourreau et al. (2018), where a monopolist serves a two-sided market and consumers negatively value advertising, though it stems from a distinct mechanism. It is instead opposite to the result in Kind and Koethenbuerger (2018), where the model finds that a tax would reduce price and increase sold ads. It should be noted, though, that this result depends on the choice of the functional form for the inverse demand function and at best provides a special case.

Proposition 5 indicates that a large ads price elasticity (that is, smaller than −1) is a key factor determining the sign of the changes in sold ads and acquired contacts. The empirical results in Lassmann et al. (2020), which suggest overshifting of profit taxes on ads price, may provide a proxy for similar pass-through of ad valorem taxes on ads. Then, the present model coupled with the use of function (16) implies that the price elasticity of ads should be quite large. Estimates of advertising demand are available for newspaper and television advertising. Argentesi and Filistrucchi (2007) study the Italian newspaper market, which is a two-sided market with paying readers on one side and advertisers on the other, and find own-price elasticities ranging between −0.91 and −0.33. Wilbur (2008) focuses on U.S. television broadcasting networks and finds an elasticity of −2.9, which is reported to be much larger than similar estimates produced in the 1970s, the latter always being between −1 and 0 (the author attributes this change to increased competition in that market). These estimates are very heterogeneous and not directly applicable to digital advertising, though they provide a rough proxy. This model calls for more empirical research specifically oriented at estimating Web advertising price elasticities, in order to provide dependable predictions about the Welfare implications of the Web Tax studied here.
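As a minimal illustration of how the parameter \(\beta\) in (16) could in principle be recovered from price and quantity observations, the sketch below runs a log-log least-squares regression on synthetic, noise-free data generated from (16) with \(\alpha = 0\). The data, the value of \(\beta\) and the helper name are all hypothetical; a real application would need to deal with noise, endogeneity and the role of q, none of which is addressed here.

```python
import math

# Hypothetical illustration: recover beta in p = D^(-beta), i.e. (16) with alpha = 0,
# from synthetic noise-free observations via a log-log least-squares fit.


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


true_beta = 1.5                                           # hypothetical value
quantities = [10.0 * (i + 1) for i in range(20)]          # synthetic total sold ads D
prices = [d ** (-true_beta) for d in quantities]          # prices implied by (16), alpha = 0

log_d = [math.log(d) for d in quantities]
log_p = [math.log(p) for p in prices]

# In logs, ln p = -beta * ln D, so the regression slope estimates -beta.
beta_hat = -ols_slope(log_d, log_p)

print(f"estimated beta = {beta_hat:.6f}")
```

Note that the slope recovered here is the elasticity of price with respect to quantity in the inverse demand (16), which is the quantity that enters the conditions of Proposition 4.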

5.2 A note on consumers’ utility and behaviour

The model is based on a number of assumptions on consumers which were imposed not only to improve tractability, but also in order to obtain more clear-cut results. Nevertheless these assumptions may significantly depart from reality. Thus, in this subsection I address informally how they might affect the main results.

The model rules out network effects for consumers. Consumers, however, might obtain larger utility from a Web service if more people use it. For example, a social networking platform could be more appealing if many people are there (because more people are then reachable or because they will produce more content). Because in the model increasing the quality of a Web service (the variable z) attracts more users to that service, consumers’ network effects may simply be thought of as a situation where consumers gain larger increases in utility when consuming higher-quality Web services. In the model’s symbols, this amounts to assuming larger values for \(\frac{\partial {\theta (.)}}{\partial {k}}\) and \(\frac{\partial {\theta (.)}}{\partial {z}}\). This affects the model quantitatively, but not qualitatively.

The assumption about ads’ utility is a more important departure from reality. On the one hand, consumers might be annoyed by receiving lots of ads. On the other hand, they might even obtain some positive utility from them, to the extent that ads are well matched to consumers’ preferences and thus provide them with interesting offers and commercial information. The latter seems more likely when ads take the form of non-invasive banners and videos which do not pose an obstacle to the use of the underlying Web services. The former condition (disutility of Web ads) is more likely, instead, when ads are more invasive, for example when a promotional video starts automatically and the user is forced to watch it before being able to go back to using the Web service (as is the case with services like YouTube). If one assumes that ads negatively affect utility, then a case is made for underselling of ads being even more likely, as any increase in sold ads would simultaneously reduce the number of acquired contacts, which would feed back as a lower price per ad. In a way, assuming a dislike for ads would strengthen the ground for Proposition 1 and the consequent analysis. On the contrary, it would question Proposition 2: if the disutility of ads is large enough, an exogenous increase in sold ads can be Welfare-deteriorating.

6 Conclusions

Base erosion of corporate taxation, both direct and indirect, is pushing governments to introduce policy reforms aimed at limiting revenue loss and tax-induced advantages benefiting Web businesses. One of the proposed policies takes the form of a special ad valorem tax on ads sales (the so-called Web Tax). I studied the effects of a Web Tax in a setting where Web companies compete in a Cournot-Nash fashion to sell advertising space to advertisers, while they enjoy monopolistic power in the market for their Web service, which they provide to users for free. In this model, Web companies can choose to increase investments in order to improve the quality of their service, thus attracting more users, and in doing so they can enhance the value paying advertisers obtain from ads. It can therefore be beneficial for Web companies to have more contacts than the quantity sold to advertisers in order to keep the ads price high.

In such a setting, I demonstrated that a Web Tax affects quantities produced and sold in opposite ways. This is a rather general result, which hints at the fact that a Web Tax increasing the ads price and reducing ads sold at the equilibrium might, in the context of the model, improve investments and thus the quality of Web services for consumers. Although the total Welfare effect of the tax also hinges on the impact on advertisers and providers of the Web services, this specific effect is a novel finding and worthy of consideration by policymakers. Further assuming a specific (isoelastic) functional form for the inverse advertising demand function, I found that a sufficient condition for obtaining a market equilibrium and an increasing ads price after introducing a Web Tax is a price elasticity of advertising demand larger than one. In the special case where quantities produced and sold coincide, and these react in sync to an increase in taxes, the magnitude of the impact that targeting technologies have on the advertisers’ evaluation of ads determines the direction of the adjustment and, in conjunction with the value of the price elasticity of advertising demand, determines whether there is over- or undershifting of the tax in case the gross price increases with it. I also derived general conditions for the tax to be Welfare improving or deteriorating, though these are rather generic and would require a better understanding of both the functional form and its parameters to be assumed in an applied policy analysis.

The present paper calls for targeted empirical work to assess the price elasticity of advertising demand and to quantitatively understand how much ads targeting technologies affect advertisers’ reservation prices. This model provides guidance for a parametric evaluation of Web Taxes based on testable quantities, which hopefully will help inform the impact assessments of future policy initiatives. In very general terms, assuming that the number of Web companies and marginal costs are small, an effective matching technology and an isoelastic demand as in Eq. (16), the model suggests that overall a Web Tax might bring Welfare gains in the form of improved service quality for users and per-ad value for advertisers, without having a large impact on the amount of sold ads.

The model has several limitations which are worth mentioning. The assumption of symmetric competition is rather unrealistic and only provides a rough stylized approximation of the Web ads market. Also each matching system was assumed to be segregated from competitors, while in reality ads intermediation and networks may instead generate forms of spillovers between large Web companies, such that increases in the user base of one may reinforce value-per-ad in another.