This paper investigates whether a search engine’s ordering of algorithmic results has an important effect on website traffic. A website’s ranking on a search engine results page is positively correlated with the clicks that it receives. This could result from the search engine’s accurately predicting the website’s relevance to users. Or it could result from users merely clicking on the highest ranked links, regardless of the website’s relevance. Using a unique dataset, we find that a website’s rank, not just its relevance, strongly and significantly affects the likelihood of a click. We also find evidence that rank influences click-through rates (CTRs) partly by controlling access to the scarce attention of users, but primarily by substituting the reputational capital of the search engine for the reputation of individual websites.
Recent anti-trust investigations of the internet search market in the US and Europe have considered to what extent search engines have the ability to influence traffic to websites. It is well known that the ranking (i.e. the hierarchical physical location on a search-results page) of websites [Footnote 1] is positively correlated with click-through rates (CTRs). [Footnote 2] If this correlation reflected a causal impact of ranking on CTRs, then search engines with a large share of total search activity would influence a large amount of traffic to websites.
How could a correlation between rank and CTRs arise if there were no causal impact of the former on the latter? This might occur through reverse causation: The search engine might accurately predict the relevance of websites to users (and therefore their likely future CTRs) and then place websites on the page as a function of this prediction.
Using a unique dataset of individual search behavior we show that there is indeed a strong positive correlation between the rank of a website on a given search engine results page (SERP) and the probability that an individual will click on that website. Although part of this correlation can be explained by the predicted relevance of the website, there is a substantial direct causal impact even when this is taken into account.
We find that being at the top of the ranking in the algorithmic search results has a large and statistically significant causal impact on the odds of receiving a user click, and that moving the website from rank 1 to rank 2 on the same page decreases the odds of a click by between one third and two thirds depending on the specific search that is undertaken. We concede that no single statistical method completely eliminates endogeneity concerns; however, our results are robust and all evidence points to very high economic significance of the algorithmic rank.
There are no studies to our knowledge in the economic literature that systematically estimate the effect of rank in the algorithmic search results using individual user data. Athey and Meidan (2011) and Athey and Nekipelov (2011) are important papers on the analysis of user behavior in the paid results. Previous research has indicated that CTRs can increase markedly for results placed at the top of a results page compared to other ranks and pages (Smith and Brynjolfsson 2001; Xu and Kim 2008; Ghose and Yang 2009). Of these studies, Xu and Kim (2008) is based on a small-sample laboratory experiment, and Ghose and Yang (2009) uses data on paid search from an advertiser. The closest study in spirit to ours is Smith and Brynjolfsson (2001), which uses individual data from an Internet book-purchase site, but this relates to purchases of homogeneous books rather than searches for heterogeneous websites.
Armstrong et al. (2009) and Armstrong and Zhou (2011) study the welfare implications of “prominence” in search markets. Among studies investigating the impact of placement on sponsored search results, Jerath et al. (2011) demonstrate the existence of a “position paradox” where advertisements at higher positions obtain more clicks, but this effect can be offset by a superior firm reputation. The paradox is that the superior firm may make higher profits from bidding for lower ranked positions. In a similar vein, Baye et al. (2012), using search data that were aggregated by retailer, consider the impact of product position and product reputation on the organic search results page on CTRs and find that both are important factors.
We proceed as follows. In Sect. 2 we describe the data selection process and provide summary statistics for our dataset. We also describe the nature of the search engine algorithm and provide descriptive evidence about the determinants of ranking. In Sect. 3 we demonstrate econometrically the effect of ranking on the probability of clicking on a website. Section 4 investigates the contribution of reputation and conspicuousness to enabling page rank to influence click probabilities. Section 5 concludes.
2.1 Query Term Selection
Microsoft Corporation provided us with access to the database of the Bing log files for individual user searches during November and December 2010 and January 2011. All search engines store data from user sessions in detailed logs. The Bing logs contain recorded observations for each of the millions of Bing user queries, including for each query: a record of the date and time; all websites that were displayed on the SERP generated from the search; each website’s position on the SERPs; and which websites were clicked. For each website that appeared in a set of search results, we know at what rank it appeared in each view and whether it was clicked on during that view.
In order to isolate the impact of website relevance from that of page rank, we need query terms where the website relevance to the user query remains reasonably constant during the time period of study, while the ranking of websites varies (even if only slightly). We also need to eliminate as far as possible other confusing influences. To find suitable data, we first categorized a list of available query terms and then eliminated the non-suitable categories until we arrived at a final list of queries.
A first type of unsuitable query is one that generates what are known as “highly monetized” results. For example, the query term “airline tickets” signals the intent to shop for airline tickets on-line and, because it is defined in generic terms, occurs with relatively high frequency. The intent to make a purchase and the high frequency make this query attractive to advertisers, and the results page is highly monetized: There is a large volume of ads. The ads distract from the algorithmic results and introduce more “noise” into the algorithmic click behavior data. In order to predict click behavior on the algorithmic results we would need to know all of the paid results as well (whose presence might well be endogenous). As a consequence, these queries are not suitable for our analysis.
A second type of unsuitable query is what is known as “superfresh”. Consider the query term “Obama approval rating”. The intent is to look for current news, and every day (sometimes every hour) a different set of websites will be most relevant and appear in the top ranks. This variability in website relevance, which we cannot directly observe and for which we cannot control, makes such query terms unsuitable for our analysis.
A third type of unsuitable query is “navigational”, where the user has a prior intent to navigate to a specific website. An example is one of the most frequent queries, “facebook”, for which the search results display the different subpages of this website. Although a large proportion of query terms have some corresponding domain name and thus could in theory be navigational, queries become unsuitable for our purposes only when such domains regularly appear among the top results on the page.
Finally, query terms that arise from non-uniform intent across users are also unsuitable. One example is the query “eclipse”. Based on the websites that are displayed on the results page, this search has at least three possible intents: to learn about a solar or lunar eclipse, to find information about a software product that is known as Eclipse, and to search for one of the Twilight Saga books with this title (which is a teenage vampire romance novel).
Thus, we manually sorted through an extensive list of queries, and found four query terms that were suitable for our purposes. In alphabetical order these are: “Free Movies”, “Fun Games”, “Phone Numbers” and “Sports”. [Footnote 3] Although some of these query terms are now monetized, none were so at the time of our study. None related to newsworthy events that might have had an impact on relevance. None were primarily navigational, and none showed significant evidence of non-uniform intent. [Footnote 4]
Algorithms are sometimes patented (the Google PageRank algorithm is covered by U.S. Patent No. 6,285,999) and exact formulas are held as trade secrets. However, the general characteristics of search algorithms are known. The paper that introduced Google (Brin and Page 1998) states that “Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.” The fundamental ranking techniques of a search engine algorithm depend on natural language processing of the content of websites, topological analysis of the connections between websites, and analysis of the interactions of consumers with search results, among other things.
A search engine algorithm proceeds in two steps: choosing the websites that match the query term and then putting them in ranking order. The first step uses keyword-focused measures, which examine the placement and count of the query term words in a website name and anchor text. [Footnote 5] Once the set of websites to be displayed in the SERP is determined, they are ranked using natural language techniques, static rank [Footnote 6] and user behavior data, such as prior website traffic and prior CTR.
This obviously raises a concern about reverse causality: It may be previous CTRs that determine ranking rather than ranking that determines future CTRs. Based on discussions with the engineers who provided us with the Bing data, we believe that at the time of our study (11/1/2010–1/31/2011), and for our selected query terms, the Bing algorithm relied on website CTRs that were calculated over long prior periods of time, and was refreshed only occasionally. As we illustrate further below, fluctuations in the CTR over short periods of time do not seem to be a determinant in Bing ranking for the query terms that we selected.
During the study period, some instability remained in the relatively new Bing algorithm, which can cause variation in ranks and is most probably the cause of the variation in page rank in our data. [Footnote 7] In addition, during this study period, the results of the Bing algorithm were not personalized to user characteristics, which further alleviates many potential data concerns.
2.3 Sample Statistics
Our sample consists of those websites that appear on Bing on the first SERP (in positions 1–10) for each of the four query terms considered. “Free Movies” resulted in views for 262 such distinct websites, “Fun Games” for 158, “Phone Numbers” for 322, and “Sports” for 996.
However, not all websites had views in all ten positions. As an illustration, Table 1 displays the top five websites (as determined by the total number of views for the time period of our analysis) for the query term “Phone Numbers”; they are displayed in the order of frequency of appearance in Rank 1.
For each of the five websites, Table 1 shows how many views each website had in each rank during our sample period, and what the website CTRs were in each rank. For example, website phonenumbers.com had 17,075 views in rank 1, and 29.5 % of the views resulted in a click-through (CTR is 0.295). The statistics for each query term show that being in the top rank is associated with higher CTRs for each domain.
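As a minimal sketch of how the view counts and CTRs of the kind reported in Table 1 are defined, the following Python snippet aggregates per-view records into views and CTR by website and rank. The record layout, the function name and the toy values are hypothetical and are not taken from the Bing logs.

```python
from collections import defaultdict

def ctr_by_site_and_rank(views):
    """Aggregate (website, rank, clicked) view records into views and CTR.

    `views` holds one tuple per appearance of a website on a SERP. This
    layout is an assumption for illustration, not the Bing log schema.
    """
    counts = defaultdict(lambda: [0, 0])  # (website, rank) -> [views, clicks]
    for website, rank, clicked in views:
        cell = counts[(website, rank)]
        cell[0] += 1
        cell[1] += 1 if clicked else 0
    return {key: {"views": v, "ctr": c / v} for key, (v, c) in counts.items()}

# Hypothetical toy records, not real data.
toy_views = [
    ("a.example", 1, True),
    ("a.example", 1, False),
    ("a.example", 2, False),
    ("b.example", 2, True),
    ("b.example", 1, False),
    ("b.example", 2, False),
]
table = ctr_by_site_and_rank(toy_views)
```

Each cell of the resulting table corresponds to one website-rank pair, so a CTR of 0.295 in rank 1 simply means that 29.5 % of that website's rank 1 views ended in a click.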
In addition, the frequency with which the top three websites appear in the top rank is also often, though not always, reflected in the ordering of their CTRs when they appear in the second rank, suggesting that some of the ranking frequency may reflect perceived website relevance. In particular, two websites, phonenumber.com and whitepages.com, are competing for the top spot on the page. Phonenumber.com has 17,075 views in rank 1 (with top rank CTR of 0.295) and whitepages.com has 14,652 (CTR is 0.274): When one website is in rank 1, the other website is usually displayed in rank 2. Phonenumber.com appears to be slightly more relevant to the user query, since it is clicked on more often in nearly every rank compared to whitepages.com. [Footnote 8] This is consistent with the observation that phonenumber.com is observed in rank 1 more often.
These data naturally raise the question of what triggers changes in ranking. In particular, we are interested in whether the data are consistent with our claim that changes in ranking are more likely to reflect random events than to have been triggered by prior changes in CTRs. To examine this further, Fig. 1 has the time series of the daily CTR (dotted line) and daily percent of views in Rank 1 (solid line) for the two leading websites for the “Phone Numbers” query.
Our main concern is whether the changes in CTR trigger the switch between the ranks for these websites. This does not appear to be the case. It is easy to observe the level change in CTR once a website is displayed in Rank 1 more often, and the changes in CTR appear to occur after—rather than to precede—the switch between the ranks.
However, visual inspection of Fig. 1 is not the way to settle the question. Our conjecture is confirmed by Granger causality tests that were run for both websites. The summary statistics of the sample used for the Granger causality tests can be found in Table 5, and the results of the Granger causality tests are reported in Table 6. Note that the proportion of views in which the domain appears in rank 1 may sum to greater than unity across websites: For instance, it would be possible for two domains each to appear in rank 1 whenever they are viewed (thus 100 % of the time), provided the domains are never viewed on the same page.
To determine the direction of causality between daily percentage of views in which the website appears in Rank 1 and its daily CTR, we perform a Wald test for the null hypothesis that lagged values of the former can be excluded from a regression of the latter, and vice versa. For the “Phone Numbers” query, we can clearly reject the null hypothesis that prior page rank has no effect on current CTRs: The F-statistic for the exclusion of the percentage of time spent in Rank 1 from the equation for CTR is significant at 1 % for one domain and 0.1 % for the other. On the other hand, we fail to reject the null hypothesis that prior CTR has no effect on current page rank.
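The exclusion test described above can be sketched in a few lines of Python. The snippet below computes the F form of the test by comparing a restricted regression (own lags only) with an unrestricted one (own lags plus lags of the other series). The lag length of one, the function names and the simulated series are assumptions made for illustration; the lag structure actually used for Table 6 is not stated in the text.

```python
import random

def ols_rss(X, y):
    """Residual sum of squares from OLS of y on the columns of X."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def granger_f(y, x, lags=1):
    """F statistic for H0: lags of x can be excluded from a regression
    of y on a constant and its own lags."""
    n = len(y)
    target = y[lags:]
    restricted = [[1.0] + [y[t - l] for l in range(1, lags + 1)]
                  for t in range(lags, n)]
    unrestricted = [row + [x[t - l] for l in range(1, lags + 1)]
                    for row, t in zip(restricted, range(lags, n))]
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    df = len(target) - (2 * lags + 1)
    return ((rss_r - rss_u) / lags) / (rss_u / df)

# Simulated (hypothetical) daily series in which rank share drives CTR.
random.seed(0)
share = [random.random() for _ in range(200)]
ctr = [0.25] + [0.1 + 0.3 * share[t - 1] + random.gauss(0, 0.02)
                for t in range(1, 200)]
f_rank_to_ctr = granger_f(ctr, share)   # does lagged rank share help predict CTR?
f_ctr_to_rank = granger_f(share, ctr)   # does lagged CTR help predict rank share?
```

In this simulated setting the first statistic is large and the second is not, which is the pattern the text reports for the “Phone Numbers” query. The statistic is compared with an F distribution with `lags` and `df` degrees of freedom.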
For the other queries the evidence is more mixed. For “Sports” the results are similar to “Phone Numbers” but at slightly lower levels of significance (5 %). For “Fun Games” there is no evidence of Granger-causality in either direction, while for “Free Movies” there is evidence of two-way causality for one domain and none for the others.
Overall, for two query terms the evidence is clearly consistent with the hypothesis, suggested to us by Bing engineers, that prior CTR is not used to determine the rank of the website. For the other query terms there is evidence of possible influence of CTR on page rank for only one of the domains used. On balance the hypothesis of lack of reverse causality seems broadly plausible given the evidence available to us.
3 Econometric Estimation
In order to estimate the effect of page rank on click probabilities we use the multinomial logit model that was developed by McFadden and used for a large variety of situations in which users make a single choice from a range of discrete options. This means that instead of estimating determinants of CTRs over a given time period we estimate the odds that a website in a given page rank is clicked on, relative to a website in the baseline Rank 10, averaged across all SERPs that gave rise to a user click. [Footnote 9] This therefore allows us to abstract from the many factors that can affect CTRs, such as time of day, since these factors do not vary between alternatives that are presented to the user in a given page view.
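In standard notation, the choice model described above can be written as follows. The symbols are illustrative shorthand that does not appear in the original: $S_i$ is the set of websites shown in page view $i$, $r_{ij}$ is the rank of website $j$ in that view, $x_j$ collects any website-level controls, and rank 10 is the baseline.

```latex
\[
P(\text{click on } j \mid \text{page view } i)
  = \frac{\exp\!\big(\beta_{r_{ij}} + \gamma' x_j\big)}
         {\sum_{k \in S_i} \exp\!\big(\beta_{r_{ik}} + \gamma' x_k\big)},
\qquad \beta_{10} = 0,
\]
\[
\frac{P(\text{click on } j \mid i)}{P(\text{click on } k \mid i)}
  = \exp\!\big(\beta_{r_{ij}} - \beta_{r_{ik}} + \gamma'(x_j - x_k)\big).
\]
```

Any term that is common to all alternatives within a page view, such as a time-of-day effect, appears in both numerator and denominator and cancels. This is why such factors drop out of the estimation, and why $\exp(\beta_r)$ can be read as the odds of a click in rank $r$ relative to rank 10 for otherwise identical websites.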
The results are presented in Tables 7, 8, 9 and 10 for the four query terms. For ease of interpretation the coefficients are presented as odds ratios, so that the effect of a given rank should be understood as the odds that the user clicks on a website in that rank divided by the odds of clicking on a website in rank 10. An odds ratio of 1 would therefore imply no effect: the rank in question was no more likely to be clicked on than is rank 10. Odds ratios less than one imply a negative effect, odds ratios greater than one imply a positive effect.
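As a hedged illustration of how odds ratios relative to rank 10 translate into click shares under a logit model, the sketch below normalizes a set of hypothetical odds ratios into choice probabilities. The values are invented for the example and are not the estimates in Tables 7, 8, 9 and 10; the calculation also assumes that the ten websites differ only in rank.

```python
def click_shares(odds_ratios):
    """Convert odds ratios relative to the baseline rank into logit choice
    probabilities, assuming the websites differ only in rank."""
    total = sum(odds_ratios.values())
    return {rank: ratio / total for rank, ratio in odds_ratios.items()}

# Hypothetical odds ratios relative to rank 10 (rank 10 equals 1 by construction).
hypothetical_or = {1: 20.0, 2: 10.0, 3: 6.0, 4: 4.0, 5: 3.0,
                   6: 2.0, 7: 2.0, 8: 1.5, 9: 1.5, 10: 1.0}
shares = click_shares(hypothetical_or)
```

By construction, the ratio of any rank's share to the rank 10 share reproduces the odds ratio for that rank, which is the interpretation given in the text.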
There is a large variation among the query terms in the magnitude of the rank effects, but the broad qualitative findings are remarkably similar. Table 7 gives the effect of rank without controlling for website relevance for each of the four query terms. We can see that being in rank 1 increases the odds of being clicked on, relative to rank 10, by between 11 times (for “Free Movies”) and 220 times (for “Phone Numbers”). This is roughly twice as large as the effect of being in rank 2, though the exact proportion varies somewhat between query terms.
There are two ways in which we control for website relevance. The first, as reported in Table 8 for each of the four query terms, is to control for the mean rank of a website over the whole sample period. This is based on the idea that the mean rank of the website does reflect the search engine’s estimate of its likely relevance to users, while deviations within the sample period from this mean rank do not reflect variations in likely relevance.
Our “Mean Rank” variable is the inverse of the arithmetic mean of the rank number, so that higher values of the variable reflect higher ranks (i.e. those closer to rank 1). Controlling for Mean Rank lowers the odds ratio for rank 1 by over half for all queries except “Free Movies”, where it has a small lowering effect.
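A minimal sketch of the Mean Rank variable as defined above, assuming that the mean is taken over every view of the website during the sample period (the weighting is our assumption; the function name and the toy ranks are hypothetical):

```python
def mean_rank_variable(ranks):
    """Inverse of the arithmetic mean of the rank positions a website occupied."""
    return len(ranks) / sum(ranks)

# Hypothetical positions observed for one website; the mean rank is 2.0.
observed_ranks = [1, 1, 2, 4]
value = mean_rank_variable(observed_ranks)
```

A website that always appears in rank 1 receives the maximum value of 1, and a website that always appears in rank 10 receives 0.1.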
Our second way of controlling for website relevance, as reported in Table 9 for each of the four query terms, is to use a dummy variable that we call “Brand” for any website that appears in rank 1 during the sample period in more than 0.5 per cent of the total number of SERP observations. [Footnote 10] This definition captures the idea that such websites are likely to be perceived as more relevant. Adding this variable to the specification that includes Mean Rank further reduces the odds ratio for rank 1 to a small extent, except for “Phone Numbers”, where it increases the ratio slightly, probably due to collinearity with Mean Rank.
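The Brand rule can be sketched as a simple threshold on how often a website holds rank 1. The snippet follows the SERP-based description in the text rather than the domain-rank definition given in the accompanying footnote; the function name, input layout and counts are hypothetical.

```python
from collections import Counter

def brand_websites(rank1_by_serp, threshold=0.005):
    """Return the websites that hold rank 1 in more than `threshold`
    of the SERP observations (0.005 corresponds to 0.5 per cent)."""
    counts = Counter(rank1_by_serp)
    n = len(rank1_by_serp)
    return {site for site, c in counts.items() if c / n > threshold}

# Hypothetical list of the website shown in rank 1 on each of 1,000 SERPs.
rank1 = ["a.example"] * 600 + ["b.example"] * 397 + ["c.example"] * 3
brands = brand_websites(rank1)
```

Here the third website holds rank 1 in only 0.3 per cent of SERPs and therefore does not receive the Brand dummy.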
As a robustness check we use separate fixed effects for each of the “Brand” websites instead of a single dummy variable, as reported in Table 10 for each of the four query terms. This lowers substantially the coefficient on Mean Rank, turning it negative in three cases out of four, without substantially altering the coefficients on Rank. This appears to indicate that the fixed effects and the Mean Rank variable are substantially collinear.
Overall, it is striking that even after these controls for relevance there is a large, statistically and economically very significant effect of being in rank 1 as compared to rank 10. Even in the most conservative specification (number 3), the odds ratios vary from around 9 (for “Free Movies”) to over 120 (for “Phone Numbers”), and this effect is at least 50 % higher and sometimes more than twice as high as the effect of being in rank 2. The effects also decline as rank declines, roughly but not strictly monotonically.
4 Forces Behind the Impact of Rank
If page rank exerts a strong causal influence on the likelihood that users click on a website, what is the reason for that effect? In particular, to what extent is it due to the fact that higher ranked websites are more conspicuous on the page, and to what extent is it due to the reputation of the search engine for delivering relevant results in the higher ranks?
To explore this question we make use of a simple insight: The reputation of the search engine for relevance will be a substitute for any reputation for relevance that the website may have in its own right. Websites with strong positive reputations will require less assistance from the reputation of the search engine.
We can therefore compare two alternative models of the process by which users search: In the first, reputation-based model, users compare all the domains that appear on a page and decide which is most likely to meet their needs, based on combining information based on the page rank (given the reputation of the search engine for reliability) with information based on the domain’s own reputation. In this case we expect to see that the higher the reputation of the domain in its own right, the less additional benefit it will gain from being in a high rank.
In the second, conspicuousness-based model, users begin at the most conspicuous point on the results page (typically though not necessarily the first rank), and decide whether to click or to continue to the next result. In this setting the reasons why a user will click immediately may be situational (such as that the user is in a hurry) or based on recognition of the domain as one with a good reputation for relevance. In this model of sequential choice, the websites with high own reputations will benefit more rather than less from being in a high rank. They have more to gain from being brought to the user’s attention since they are more likely to hold such attention and convert it into a decision to click.
This suggests looking for interaction relationships between our rank variable and our separate measures of website relevance: Mean Rank and Brand. If the positive impact of being in a high rank is due principally to reputation, we should observe a smaller additional effect of reputation (as measured by our relevance indicators) for websites that appear in the higher ranks. Conversely, if it is due principally to conspicuousness, we should observe a larger additional effect of reputation (as measured by our relevance indicators) for websites that appear in higher ranks.
Tables 11, 12 and 13 explore this question by interacting our relevance measures with page rank for each of the four query terms. For both Mean Rank and Brand, we include an interaction term for the variable for the first two ranks only. [Footnote 11] If the coefficient on this interaction variable is greater than one, relevance is more important for websites in higher ranks; if it is less than one, relevance is less important in higher ranks.
The results tend to indicate a lower effect of Mean Rank in the top two ranks than in the remaining ranks, but the evidence is not unequivocal. For two out of four query terms the interaction term is strongly and significantly less than one, while for one other query it is insignificantly less than one and for the other it is insignificantly greater than one. For Brand three of the interaction terms are less than one but only one is significantly so, while the other is significantly greater than one. When both sets of regressors are included together, no clear pattern emerges, though the coefficients indicate the likelihood of significant collinearity.
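One way to write the interaction specification described above is given below. The notation is illustrative rather than taken from the original tables: $V_{ij}$ is the logit index for website $j$ in page view $i$, $M_j$ is the relevance measure (Mean Rank or Brand), and $T_{ij}$ indicates that the website appears in one of the first two ranks.

```latex
\[
V_{ij} = \beta_{r_{ij}} + \gamma\, M_j + \delta\, \big(M_j \times T_{ij}\big),
\qquad T_{ij} = \mathbf{1}\{ r_{ij} \le 2 \}.
\]
```

The reported interaction coefficient then corresponds to the odds ratio $\exp(\delta)$. A value above one means that relevance raises the odds of a click by more in the top two ranks, as the conspicuousness-based model predicts, while a value below one means that relevance matters less there, as the reputation-based model predicts.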
On balance, the evidence is suggestive rather than conclusive. Nevertheless, it suggests that reputation is a stronger force than conspicuousness in explaining the causal impact of page rank on click probabilities, but that conspicuousness has a role to play as well.
We have shown in this paper that when a website appears in a high rank on a Search Engine Results Page it has a substantial and highly significant positive causal effect on the probability that a user will click on the website. We have done so using a unique data set that allows us to abstract from the fact that search engines determine rank partly by predicting the likely relevance of websites to user needs.
We have shown that this estimation is robust to possible concerns about the endogeneity of page ranking. We have further provided evidence that suggests that rank influences CTRs somewhat more by substituting the reputational capital of the search engine for the reputation of individual websites than by making websites more conspicuous. However, there is also some evidence that conspicuousness plays a role, which implies that one of the assets that search engines deploy is access to the scarce attention of users.
Footnote 1: We use the term website to describe the search result hyperlink and associated domain name that when clicked takes a user to the webpage that is associated with that hyperlink.
Footnote 2: Search engines display two types of results: These are often called paid results and algorithmic results. Our analysis focuses on algorithmic results, which are results that are ranked based on how relevant the search engine infers them to be to users (rather than based on payments from the website).
Footnote 3: In our data we identify blended search results (those compiled by the search engine, usually with multiple links and an image in one installment), and omit SERPs in which blended search results occupy any of the top three ranks. We omit the SERPs that have two or more clicks to different websites on the same page, and count two or more clicks to the same website on the same SERP as one.
Footnote 4: Only “Sports” had the corresponding domain “sports.com” appearing among the top five domains, and it gathered many fewer views and clicks in the top three positions than did the top three domains. The only evidence we could find of non-uniform intent in our four queries was the appearance of the domain “Wikipedia.com” in the results for “Phone Numbers”, and this domain also appeared very rarely in the top three positions.
Footnote 5: Anchor text is the text in a hyperlink that leads to the website and website content.
Footnote 6: Static Rank is computed based on the ontological map of all web pages, consisting of nodes and links between them. Given these interconnections, Static Rank assigns a score to each website. This score represents the probability that a person starting at a random page and randomly clicking on links will arrive at the website in question.
Footnote 7: Variation in ranking can be caused by maintenance operations on some of the servers, for example.
Footnote 8: The exception is rank 5, where phonenumber.com has six views and a CTR of zero (zero clicks out of six views) and whitepages.com has 10 views and a CTR of 0.1 (one click out of ten views). This difference seems to be attributable to chance variation, given the minuscule number of views and clicks involved.
Footnote 9: We use Rank 10 as the baseline since ten is the maximum number of results that appear in a single page view.
Footnote 10: Our formal definition is that it appears in more than 0.05 % of the total domain-rank observations; since not every SERP has the full ten ranks this is almost but not quite equal to 0.5 % of the total SERPs. We have experimented with different percentage definitions and nothing of importance depends on the precise proportion.
Footnote 11: Interacting the variable for more than two ranks creates collinearity problems since a large proportion of the observations are dominated by the presence of domains that appear in the top ranks.
Armstrong, M., Vickers, J., & Zhou, J. (2009). Prominence and consumer search. RAND Journal of Economics, 40, 209–233.
Armstrong, M., & Zhou, J. (2011). Paying for prominence. The Economic Journal, 121, F368–F395.
Athey, S., & Meidan, M. (2011). Exchange rate fluctuations, consumer demand, and advertising: The case of internet search. Working paper.
Athey, S., & Nekipelov, D. (2011). A structural model of sponsored search advertising auctions. Working paper.
Baye, M., De Los Santos, B., & Wildenbeest, M. (2012). What’s in a name? Measuring prominence and its impact on organic traffic from search engines. Kelley School of Business BEPP 9.
Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30, 107–117.
Ghose, A., & Yang, S. (2009). An empirical analysis of search engine advertising: Sponsored search in electronic markets. Management Science, 55, 1605–1622.
Jerath, K., Ma, L., Park, Y. H., & Srinivasan, K. (2011). A position paradox in sponsored search auctions. Marketing Science, 30, 612–627.
Smith, M., & Brynjolfsson, E. (2001). Consumer decision-making at an internet shopbot: Brand still matters. The Journal of Industrial Economics, 49, 541–558.
Xu, Y., & Kim, H. W. (2008). Order effect and vendor inspection in online comparison shopping. Journal of Retailing, 84, 477–486.
The analysis for this paper was performed in conjunction with ongoing work for Microsoft Corporation; Keystone and Seabright acknowledge financial support from Microsoft. The primary data that were used in the analysis come from log files from Microsoft’s Bing search engine and include the website names, search result rankings, and click-through rates for results that were presented in response to user search queries on Bing.com. Supplementary data for additional website click-through rates were derived from additional Microsoft opt-in consumer panel data that contained online behavior.
The authors are grateful to Elan Fuld, who provided excellent research assistance; to Scott Gingold, who was involved with the project from the beginning and gave us invaluable feedback, ideas, encouragement and support; and to Thierry Magnac, who gave us invaluable econometric expertise and spent much time helping us to understand the data.
Glick, M., Richards, G., Sapozhnikov, M. et al. How Does Ranking Affect User Choice in Online Search?. Rev Ind Organ 45, 99–119 (2014). https://doi.org/10.1007/s11151-014-9435-y