Programmatic Advertising allows advertisers to bid for single advertising impressions, i.e., each time a user visits a website, advertisers can decide whether they would like to bid for the opportunity to be displayed to that specific user and at what price. Programmatic Advertising, which emerged around 2009, thereby generates a huge amount of data that can be used for decision making purposes (e.g., bidding). This article provides an overview of the two fundamental decision making fields in Programmatic Advertising: budget allocation across the media mix and micro decision making in Programmatic Advertising ad auctions at the individual user-level. We outline state-of-the-art modeling techniques used in both decision making areas as well as the specific challenges faced by analysts when developing models. In addition, we present common heuristics used by practitioners and the potential drawbacks of using heuristics rather than statistical models.

1 Evolving Media Usage

In the past, media usage typically consisted of listening to the radio while reading the newspaper over breakfast and watching a game show on TV in the evening with the family. However, media offerings and how they are used have changed tremendously in recent years. First, the number and granularity of media channels have exploded over the last 20 years. At the same time, different media channels are converging, e.g., the line between broadcast and IP-based television programming is blurring. Second, simultaneous usage of multiple media offerings has become a widespread phenomenon, called “second screen”. In 2014, according to a study by InMobi, more than half of viewers used smartphones or tablets while watching TV (66 % of the users in the US, around 60 % of the users in Germany and the UK).

Growing media diversity and the capability to quantify advertising response in new media offer advertisers new ways to communicate with customers and optimize communication across media channels. Traditional media planning, meaning the allocation of vast budgets to a few media channels (i.e., TV, radio, print), changed irrevocably when search engine advertising (SEA) and Programmatic Advertising moved in along with the ever more fragmented world of publishing. Advertisers now need to answer the following questions: Which media channels should we invest in? How can we quantify the success of individual channels? Should a specific user see one more ad, or is the number of ads shown to the user already sufficient to convert her? What kind of ad should we show the user, and how much money should we bid for the opportunity to show her our ad?

2 Why Measure Advertising Effectiveness in Programmatic Advertising?

Analyses of the effects of advertising have been conducted in scientific and practical applications for more than 50 years, with the intention of determining variables that impact advertising response and aiding in decision making regarding the allocation of budgets.

Online advertising was added to the mix about 15 years ago, with the mantra of unlimited measurability of advertising response. For more than a decade, new business models emerged and media agencies focused on so-called performance marketing, i.e., measuring and analyzing impressions, clicks, and conversions. While Programmatic Advertising is still rather new and emerged around 2009, we already see the development of various management heuristics that also seem to create the illusion of complete measurability and controllability. However, it has become apparent that, even with the extensive tracking options available, it is impossible to measure the impact and effectiveness of advertising without uncertainty.

One of the key questions when deciding on the objective of measuring advertising effectiveness is: Which decisions should and can be made to manage ads based on their effectiveness? This question is closely linked to the two decision levels shown in Fig. 1. The upper level deals with the dynamic allocation of budgets across channels, i.e., how much of the total budget should be spent on Programmatic Advertising and when. These allocations frequently use performance indicators such as cost per mille, GRP, cost per click, and cost per order, that are rather easy to calculate. Such indicators can be compared across channels when working towards a certain goal, such as maximizing the number of new customers given a specific budget. However, they are merely supplemental aids applied to support decision making and do not necessarily lead to optimized advertising effectiveness.

Fig. 1 Measuring advertising effectiveness and decision making levels (PA stands for Programmatic Advertising)

Finding a statistically sound solution to optimal budget allocation is extremely difficult. It assumes that it is possible to consider the marginal costs (saturation effects) of the individual channels as they relate to the budget and to time. Budget allocation over time, also called pacing, has become an important task in Programmatic Advertising. Heise et al. (2014), for instance, develop a profit-maximizing pacing algorithm that allocates budgets taking into account a time slot’s (e.g., an hour) profitability. This, however, requires that analysts are able to estimate each time slot’s profitability based on clickthrough and conversion rates. But often the data as well as the skills required for decision making in that area are not available (Dinner et al. 2011). So deciding on proper budget allocation across channels is often based on experience, the performance indicators mentioned above, and on “gut feeling”.
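To make the idea of profitability-based pacing concrete, the following is a minimal sketch and not the algorithm of Heise et al. (2014). It assumes that the analyst already has estimates of clickthrough rate, conversion rate, margin, and cost per impression for each time slot, and it simply splits the budget in proportion to the estimated profit per impression. The function names and the proportional rule are illustrative assumptions.

```python
def expected_profit_per_impression(ctr, cvr, margin_per_conversion, cost_per_impression):
    """Estimated profit of one impression in a time slot (all inputs are estimates)."""
    return ctr * cvr * margin_per_conversion - cost_per_impression


def allocate_budget(total_budget, slot_profitability):
    """Split a budget across time slots in proportion to estimated profitability.

    Assumption: slots with non-positive estimated profitability receive nothing.
    """
    positive = {slot: max(p, 0.0) for slot, p in slot_profitability.items()}
    total = sum(positive.values())
    if total == 0:
        return {slot: 0.0 for slot in slot_profitability}
    return {slot: total_budget * p / total for slot, p in positive.items()}


# Hypothetical hourly slots with made-up profitability estimates.
plan = allocate_budget(100.0, {"08:00": 1.0, "12:00": 3.0, "03:00": -1.0})
print(plan)
```

A real pacing algorithm would also have to respect the available impression volume per slot and update its estimates during the day; the sketch ignores both.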

The second decision making area lies in controlling advertising impressions at the individual user-level and closely relates to so-called customer journey analysis. The fourth section of this article presents statistical models that can be applied to support decisions at the individual user-level.

In summary, there are two essential objectives of measuring advertising effectiveness: One is to allocate the available budget across channels and time (upper level of Fig. 1), and the other is to best manage advertising at the micro-level (lower level of Fig. 1).

3 Challenges and Evaluation Criteria of Methods

The development of models and methods to evaluate and manage Programmatic Advertising is associated with numerous challenges. When evaluating the different methods available to the analyst, it is crucial to understand whether and to what extent various challenges are addressed by these methods. We start by describing the challenges analysts in Programmatic Advertising are faced with:

Huge Amount of Unstructured Data

For advertisers, simply tracking ad impressions means collecting vast amounts of data on display advertising. Programmatic Advertising adds another layer, because participating in a Programmatic Advertising auction does not always lead to an ad impression. If an advertiser wins only one in 20 auctions, 20 times more data is generated than with traditional display advertising. Each Programmatic Advertising bid request also contains extensive data on placement. The applied methods must be able to handle and process such vast quantities of data, easily 100 GB per day (Stange and Funk 2014).

Few Primary, Many Secondary Attributes

The data available today is characterized by a large number of records, each of which possesses only a few attributes. But because of the many values each attribute can take (e.g. the many publishers, peripheries, themes) and the time structure of the data, a multitude of secondary attributes can be derived from this data. In order to make sense of the data, it is reasonable to categorize the various attributes: For example, instead of specifying a certain publisher, the analyst should use a context categorization (e.g. financial context instead of Wall Street online) in the evaluation of the advertising contact. The time structure of the customer journey also offers virtually unlimited ways to derive secondary attributes, such as: How active was the user in the last hour? How many of the advertiser’s ads has the user seen in the last 30 days? How did she react to the ads (click/search/on-page)? The extent to which statistical models allow the advertiser to interpret their outcome is primarily a function of the secondary attributes applied. Choosing the right model requires experience, knowledge of user behavior, and an experimental, ongoing process.
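The derivation of secondary attributes can be sketched as follows. The event format (dictionaries with the keys time, type, advertiser, and publisher), the attribute names, and the publisher-to-context mapping are assumptions made for illustration only; real bid request and tracking data will look different.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from individual publishers to a context category.
PUBLISHER_CONTEXT = {"Wall Street online": "finance"}


def derive_secondary_attributes(events, advertiser, now):
    """Derive a few time-based secondary attributes from one user's raw events."""
    hour_ago = now - timedelta(hours=1)
    month_ago = now - timedelta(days=30)
    # Contacts with the advertiser's own ads during the last 30 days.
    own_recent = [
        e for e in events
        if e["advertiser"] == advertiser and month_ago <= e["time"] <= now
    ]
    return {
        # How active was the user in the last hour?
        "activity_last_hour": sum(1 for e in events if hour_ago <= e["time"] <= now),
        # How many of the advertiser's ads has the user seen in the last 30 days?
        "own_views_30d": sum(1 for e in own_recent if e["type"] == "view"),
        # How did she react to the ads?
        "own_clicks_30d": sum(1 for e in own_recent if e["type"] == "click"),
        # Context categories instead of individual publishers.
        "contexts_30d": sorted(
            {PUBLISHER_CONTEXT.get(e["publisher"], "other") for e in own_recent}
        ),
    }
```

Each returned value can serve as one input variable of a statistical model; the choice of time windows (one hour, 30 days) follows the examples in the text and would be tuned in practice.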

Data Does Not Show the Whole Picture

The data gained from Programmatic Advertising can only provide a limited view of the users’ behavior. Even if there is other data, e.g. socio-demographic, available in addition to the customer journey data, a prediction of how users will behave in the future based solely on data is very uncertain: There is never a complete picture of the user, her preferences, other (offline) influencing factors and her environment. Thus, there are two requirements that suitable methods need to fulfill: First, the model needs to be able to consider data that is not directly related to the customer journey (e.g., printed ads in offline channels, weather data, competition). Second, the method needs to allow for determining the uncertainty/predictive power of the model.

Heterogeneity and Dynamics

Every user reacts differently to advertising. Thus, the ideal advertising intensity is different for each user. In the same way, the number of contacts in Programmatic Advertising and other channels needed to convert a user varies. In addition, the users can be divided into a group that responds positively to advertising and one that resists it (Nottorf 2014). Amongst users who respond positively to advertising, the probability that they will react to an ad rises with each contact. In the other group, which is typically larger, the probability decreases with each contact. Methods have to take into account this heterogeneity across users in order to be able to make optimal decisions. Furthermore, user behavior is not static; it changes over time, in the long term as well as seasonally. For Programmatic Advertising, this means that not only does the model-based prediction of a user’s click and purchase probability have to be re-calculated with each click, but the model itself might also change over time.

Cause and Effect

Managing advertising is not a controlled experiment in which treatments can be implemented and steered independently of one another. Rather, advertising success and managerial action are mutually dependent resulting in potential endogeneity problems. In addition, advertisers synchronize various media channels to a certain extent such as search engine ads, Programmatic Advertising, and TV leading to the so-called problem of collinearity, which in the worst case makes it impossible to answer the question of which treatment resulted in which success. Good methods identify the problem of collinearity and show the associated uncertainty around the impact of a specific medium (Note: Only specific field experiments that vary the constellation of the media channels can solve the problem; in contrast, a method can merely reveal the problem).

Methods and the models used must deal with these challenges and Anderl et al. (2013) have derived the following requirements for models trying to quantify advertising effectiveness:

  • Objectivity (Model specification is transparent, calculation is data-based)

  • Predictive Accuracy (Future user behavior is predicted as precisely as possible)

  • Robustness (Re-calculation and slightly modified input data produce robust results)

  • Interpretability (Results of analysis can be applied at the lower level to manage ads for individual users, or they can be interpreted as the impact of individual media channels to guide budget allocation)

  • Versatility (New media channels, data types, and influencing factors can be taken into consideration without extensive effort when estimating the model)

  • Algorithmic Efficiency (Scalability: When calculating and applying the model, computational complexity increases only moderately with the amount of data analyzed).

4 Models that Support Decision Making

4.1 Heuristics

In practice, various heuristics are used to evaluate the performance of Programmatic Advertising and other online advertising channels. Schröter et al. (2013), for instance, describe a commonly used heuristic for attributing the advertising success to individual contacts and media channels. The approach uses those customer journeys that facilitated successful advertising to calculate the contribution of individual contacts (e.g., aggregation at the level of individual media channels or publishers). In the simplest case, each contact that is part of a customer journey containing n contacts is attributed with an n-th proportion of the success. The alternative is that contacts are attributed with different amounts of the success of advertising as a factor of their position in the customer journey, e.g., contacts at the beginning and end of the customer journey are considered to have had more impact leading to the so-called bathtub model. Another model attributes greater success to the last contacts, which emphasizes the lasting effect of ads in a user’s mind. If the price per contact is also considered, this model can be used to calculate costs per order (CPO) for each contact. These costs can then be used when submitting bids in Programmatic Advertising or to allocate budgets to Programmatic Advertising. In practice, this approach is closely linked to the comprehensive term “attribution modeling”.
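The attribution heuristics described above can be sketched in a few lines. The concrete weights are assumptions: the bathtub variant below gives 40 % of the success to each of the first and last contacts, and the recency variant lets the weights grow linearly with the position. Neither value comes from Schröter et al. (2013); they only illustrate the principle.

```python
from collections import defaultdict


def linear_weights(n):
    """Each of the n contacts receives an n-th share of the success."""
    return [1.0 / n] * n


def bathtub_weights(n, edge_share=0.4):
    """First and last contact receive edge_share each (assumed value)."""
    if n == 1:
        return [1.0]
    if n == 2:
        return [0.5, 0.5]
    middle = (1.0 - 2 * edge_share) / (n - 2)
    return [edge_share] + [middle] * (n - 2) + [edge_share]


def recency_weights(n):
    """Later contacts receive more credit (weights proportional to position)."""
    total = n * (n + 1) / 2
    return [(i + 1) / total for i in range(n)]


def attribute(successful_journeys, weight_fn):
    """Sum the attributed success per channel over all successful journeys."""
    credit = defaultdict(float)
    for journey in successful_journeys:
        for channel, weight in zip(journey, weight_fn(len(journey))):
            credit[channel] += weight
    return dict(credit)


def cost_per_order(spend, credit):
    """CPO per channel: spend divided by the attributed number of orders."""
    return {ch: spend[ch] / c for ch, c in credit.items() if c > 0}


journeys = [["PA", "SEA", "PA", "SEA"], ["SEA", "PA"]]
print(attribute(journeys, linear_weights))
print(attribute(journeys, bathtub_weights))
```

Note that the input consists of successful journeys only, which is exactly the limitation of this heuristic: unsuccessful journeys never enter the calculation.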

On the one hand, this type of analysis of advertising effectiveness is simple and easy to apply, which explains its widespread use in practice. On the other hand, there are some problems with this approach and it does not completely meet the requirements stated in the previous section: First of all, the success of a certain position in the customer journey can hardly be determined a priori and usually goes beyond management’s gut instinct. This, however, means that it remains unclear which of the different attribution models might be the most suitable one. Second, time aspects of the customer journey are not depicted adequately, i.e., it makes no difference whether the first contact happened yesterday or 30 days ago. Third, the impact of different types of interactions (view, click, on-site activity) cannot be determined; they can at best be considered based on hypotheses. Fourth, the analysis does not compare successful and unsuccessful customer journeys; it looks at only the former. So the essential information as to whether the success is statistically significant and the direct result of the advertising contact is lost. And fifth, offline channels and other factors not ascertained during the customer journey cannot be examined.

In summary, the approach described above can aid in the initial assessment of advertising effectiveness in various channels and sub-channels (e.g. publishers in real-time), but it is not suited to support a bidding strategy at the level of individual users.

Several rules-based methods are applied in practice for that purpose. The associated rules are rather diverse (Schröter et al. 2013) and will be explained using three examples: First, retargeting addresses users who are already aware of the advertiser’s offering, either because they have visited the website or because they have actively interacted with other advertising channels. Second, if there is socio-demographic data available (usually compiled by third-party suppliers), it can be used to address users whose profile fits the definition of the advertiser’s target group (e.g. gender, age, household income). Third, a recent innovation is offered by agencies that use Programmatic Advertising to deliver display advertising that is precisely synchronized to the broadcasting of TV ads, increasing the probability of reaching users watching TV and using a second screen at the same time. What these three bidding strategies in Programmatic Advertising have in common is that bids are submitted following specific rules. In that sense, it is not individual users who are addressed but groups of users with similar characteristics. This differs from the statistical models used to control bidding described in the next section.

4.2 Statistical Models

The purpose of statistical models typically is to predict user behavior, e.g., the purchase of a certain product. In general, these models are used to calculate the impact of different alternatives (e.g., different ad designs) that can be chosen to achieve a certain goal. So the probability of a user behaving in a certain way in the future is calculated based on his behavior in the past and on the potential decision making alternatives available to the advertiser – shown mathematically: p(future behavior|previous behavior & decision making alternative). Then, the alternative is selected that maximizes the probability of the desired future user behavior, taking into consideration the advertiser’s costs associated with the specific option.

An example: Let’s assume that an advertiser has to decide on a bid for a user she already knows. The advertiser knows which ads the user has already seen and how she reacted to them (e.g. visit to website, active search). Using a statistical model, the advertiser can now predict how the additional appearance of the ad in Programmatic Advertising will affect the probability that the user will purchase a certain product. The advertiser then makes a decision based on the anticipated profit margin and the cost of the ad.
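The decision in this example can be written down as a simple expected-value comparison. The sketch assumes that a statistical model has already produced two purchase probabilities for the user, one with and one without the additional ad; the function names and all numbers are hypothetical.

```python
def expected_gain(p_with_ad, p_without_ad, profit_margin):
    """Expected additional profit caused by showing the ad once."""
    return (p_with_ad - p_without_ad) * profit_margin


def should_bid(p_with_ad, p_without_ad, profit_margin, expected_cost):
    """Bid only if the expected additional profit exceeds the cost of the ad."""
    return expected_gain(p_with_ad, p_without_ad, profit_margin) > expected_cost


# Hypothetical numbers: the ad lifts the purchase probability from 2 % to 3 %,
# and a purchase is worth a margin of 50 currency units.
print(expected_gain(0.03, 0.02, 50.0))
print(should_bid(0.03, 0.02, 50.0, expected_cost=0.002))
```

The expected gain also acts as an upper bound for a reasonable bid: paying more than this amount for the impression would lose money on average under the model's predictions.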

There are different approaches to developing statistical models; two of them will be explained briefly. The model developed by Nottorf and Funk (2013, 2014) for Programmatic Advertising is based on the previous work of Chatterjee et al. (2003). The influence of advertising on the customer journey is represented by short-term and long-term effects (Fig. 2). The short-term effects X_actual represent the interactions of user i with advertising within the last hour. These can be any of hundreds of different possibilities (e.g., visual ad contact within the last hour, number of clicks or searches, on-site activity, interaction terms such as “User first saw banner and then went to website via SEA”). Y_actual stands for advertising contacts from longer ago (e.g., last 30 days). The model (Fig. 2) can also be used to specify influencing factors not directly related to a single customer journey (Z_actual), e.g., a competitor’s printed ads or one’s own TV campaign (e.g. www.adference.de). As described in Sect. 3, specifying suitable variables is one of the essential tasks in developing the model. A case study offering a good starting point for the development of a model for a specific advertiser is that conducted by Nottorf (2014). This model is special in that it allows for differences in the way users react to advertising (heterogeneity, see above). On the one hand, this offers greater flexibility and more accurate predictions, but on the other hand it means that more effort is required to estimate the respective model (determination of the unknown parameters α_i, β_i, γ_i). It is not necessary to assign the weights in the customer journey as is the case in attribution modeling (Sect. 4.1); instead, the weights are estimated from the data.

Fig. 2 Statistical model for predicting the conversion probability of a user i at a specified time t (PA again stands for Programmatic Advertising)
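Since the figure itself is not reproduced here, the following sketch only illustrates how such a model could turn the three groups of variables into a conversion probability. It assumes a logistic link and user-specific coefficient vectors alpha_i, beta_i, and gamma_i; the functional form, the intercept, and all numbers are assumptions and should not be read as the specification used by Nottorf and Funk.

```python
import math


def conversion_probability(x_short, y_long, z_external,
                           alpha_i, beta_i, gamma_i, intercept=0.0):
    """Conversion probability of user i under an assumed logistic model.

    x_short:    short-term advertising interactions (e.g. last hour)
    y_long:     longer-term advertising contacts (e.g. last 30 days)
    z_external: factors outside the customer journey (e.g. TV campaign)
    alpha_i, beta_i, gamma_i: user-specific coefficients, one per variable
    """
    score = intercept
    score += sum(a * x for a, x in zip(alpha_i, x_short))
    score += sum(b * y for b, y in zip(beta_i, y_long))
    score += sum(g * z for g, z in zip(gamma_i, z_external))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical user: one banner view in the last hour, three contacts in the
# last 30 days, own TV campaign currently on air.
p = conversion_probability(
    x_short=[1.0], y_long=[3.0], z_external=[1.0],
    alpha_i=[0.8], beta_i=[-0.1], gamma_i=[0.3], intercept=-4.0,
)
print(p)
```

Heterogeneity enters through the index i: two users with identical contact histories can receive different probabilities because their coefficients differ. Estimating these coefficients is the costly part and is not shown.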

Once a model has been estimated, it can be applied to a “new” user’s data within milliseconds, and its output can be interpreted as a recommendation in Programmatic Advertising. In general, this method is considered a classification process. There are many statistical data mining methods (e.g., neural networks, support vector machines) that can be applied as soon as the variables have been specified. The forecasting quality of the models developed tends to have less to do with the selected method and more with the astute definition and selection of variables, taking into consideration all of the available information on the advertiser, the competitive environment, and customer behavior as well as the service/product advertised.

The second approach to be explained here is the calculation of so-called Markov models. These are also intended to predict the probability of desired user behavior (see above). In general, Markov models describe potential transitions between a system’s states and their probabilities. They can be shown as a graphic network model, with the nodes representing the states and the edges the transition probabilities. The order of a Markov model indicates how many of the states previously visited have an effect on the probability of transition to the next state. In a first-order Markov model, the likelihood of transitioning to the next state depends solely on the current state. A second-order Markov model takes into consideration the current and the previous state.
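The difference between first-order and second-order models becomes clearer when transition probabilities are estimated from example data. The sketch below counts observed transitions in a handful of made-up customer journeys; the state names START, PURCHASE, and NO_PURCHASE and the function name are assumptions for illustration.

```python
from collections import Counter, defaultdict


def transition_probabilities(journeys, order=1):
    """Estimate transition probabilities by relative frequencies.

    The history is the tuple of the last `order` states (shorter at the
    start of a journey). Every journey is assumed to end in a terminal
    state such as "PURCHASE" or "NO_PURCHASE".
    """
    counts = defaultdict(Counter)
    for journey in journeys:
        path = ["START"] + list(journey)
        for i in range(len(path) - 1):
            history = tuple(path[max(0, i - order + 1): i + 1])
            counts[history][path[i + 1]] += 1
    return {
        history: {state: c / sum(nxt.values()) for state, c in nxt.items()}
        for history, nxt in counts.items()
    }


journeys = [
    ["PA", "SEA", "PURCHASE"],
    ["PA", "NO_PURCHASE"],
    ["SEA", "PURCHASE"],
]
first = transition_probabilities(journeys, order=1)
second = transition_probabilities(journeys, order=2)
print(first[("PA",)])           # depends on the current state only
print(second[("START", "PA")])  # depends on the current and the previous state
```

With higher orders the number of possible histories grows quickly, so more data is needed to estimate each transition probability reliably.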

How can these models be applied to Programmatic Advertising? In a simple model, interaction with a media channel (e.g. Programmatic Advertising or SEA) as well as purchasing a product are both interpreted as states. The customer journey can be considered as a series of transitions between states. The transition probabilities between the states can be calculated using historical customer journey data, allowing the purchase probability at any point in time to be estimated for a customer journey that has not yet been assessed. Archak et al. (2010) suggest using first-order Markov models to model the impact of decisions related to online advertising (submitting a bid in Programmatic Advertising) applying the removal effect. This is done by removing the node in question, in this case Programmatic Advertising, from the network model and then using the remaining network to calculate how the purchase probability changes. If adding the node (placing an ad) has a positive effect and the cost is justified, a bid is submitted. Anderl et al. (2013) apply higher-order Markov models and use case studies to prove the forecasting quality of their approach.
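The removal effect can be illustrated with a small first-order model. The transition probabilities below are made up, and the sketch makes one modeling assumption that the text does not spell out: when a node is removed, every journey that would have passed through it is treated as ending without a purchase. Other conventions are possible.

```python
def purchase_probability(transitions, removed=None, start="START",
                         success="PURCHASE", iterations=200):
    """Probability of eventually reaching `success` from `start`.

    transitions: {state: {next_state: probability}} of a first-order model.
    removed:     optional state taken out of the network; transitions into
                 it are counted as lost (assumption, see lead-in).
    """
    states = set(transitions) | {t for nxt in transitions.values() for t in nxt}
    value = {s: 0.0 for s in states}
    value[success] = 1.0
    # Repeatedly propagate purchase probabilities backwards through the network.
    for _ in range(iterations):
        for state, nxt in transitions.items():
            if state == removed:
                continue
            value[state] = sum(
                p * (0.0 if target == removed else value[target])
                for target, p in nxt.items()
            )
    return value.get(start, 0.0)


# Hypothetical transition probabilities.
transitions = {
    "START": {"PA": 0.6, "SEA": 0.4},
    "PA": {"SEA": 0.5, "PURCHASE": 0.1, "NO_PURCHASE": 0.4},
    "SEA": {"PURCHASE": 0.3, "NO_PURCHASE": 0.7},
}
base = purchase_probability(transitions)
without_pa = purchase_probability(transitions, removed="PA")
removal_effect = (base - without_pa) / base
print(base, without_pa, removal_effect)
```

In this made-up network, removing Programmatic Advertising lowers the purchase probability, so the channel contributes positively; whether a bid is justified still depends on comparing that contribution with the cost of the ad.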

5 Conclusion

Analysis of advertising effectiveness in real-time is still in its infancy, so it should not be seen as a project but as an ongoing process (a process requiring extensive effort).

To successfully establish continuous and action-oriented analysis of advertising effectiveness, advertisers must be aware of the objective and must determine the decision making level of online advertising (budget allocation across channels or placement of ads for individual users) at which the results of the analysis should be used.

The analysis of advertising effectiveness as well as its interpretation and use pose a number of challenges to advertisers. The requirements described in Sect. 3 can serve as a general guideline for selecting suitable methods and models. Caution is essential when applying and interpreting simple management heuristics and their promised cure-alls. Identifying significant control parameters in Programmatic Advertising and predicting user behavior will luckily remain uncertain and at the same time exciting for practical applications as well as for research.