On September 11, 2001, Internet users searched Google for information on the unfathomable events taking place in Manhattan. But the search “New York Twin Towers” came up with no hits. A full month had passed since the words “twin towers” had last been indexed, which meant that at this crucial moment, the term had not been updated as a search object. All the results Google came up with were frustratingly irrelevant to the acute needs of the users. As an emergency solution, Google created a breaking news page featuring “News and Information on the US Attacks”, which was placed at the top of the search list. The page featured links to the websites of news outlets and other news organizations as well as useful links to aid organizations, emergency services, and phone numbers of airlines and hospitals. This episode prompted Google, over the following year, to develop a news filter as part of its search algorithm, so that current headlines now came up at the top of the search list when a user entered the relevant search words.Footnote 1

A lot has happened since. Google has long held an indispensable position as a kind of public utility. The company offers everyone free information: equipped with a smartphone or an Internet connection, anyone can get an immediate answer to any lexical question. But the information the user encounters is no longer the same. It is now ordered according to radically different principles than before. Somewhat unwittingly, people all over the world have invited Google and the other tech giants into the most private spheres of their lives. These giants see this development as a data race for relevance—relevance for the individual user and for the voracious advertisers.

The growing integration of digital experiences into everyday life is not the result of one isolated shift but the consequence of gradual changes. The Internet of today is often described as the third wave. First came the Web 1.0 of the 1990s, built around websites, email and simple search engines. Then came the Web 2.0 of the early 2000s, with its expansion of blogs, wikis and social networks, during which Google began to sell ads associated with search words.Footnote 2 Starting around 2010, the Smart Web 3.0 has taken over—a third wave driven by big data and smartphones.Footnote 3 Big data refers to three overlapping things. First, it simply refers to the massive and increasingly available amounts of data. Second, it covers the analytical techniques used to extract useful information from those data. And finally, it is associated with companies such as Facebook, Twitter and Google, which use extensive data analyses of user behavior as a core part of their business model.Footnote 4

Meanwhile, the algorithms on which the tech giants base their businesses have become more and more complex. An algorithm is simply a rule-governed procedure aimed at solving a class of problems—parallel to a recipe in which one keeps adding ingredients in the form of data. The concept dates back to ancient Greek mathematics, while the name derives from al-Khwarizmi, a Persian mathematician of the ninth century. Computer software became a readily available way to formalize, develop and further automate algorithms for many different purposes. In the context of the tech giants, the word refers to central complexes of computer software which govern searches and the ranking of search results (Google), or networks of “friends” and followers and what information flows through a given network (Facebook, Twitter), etc.—all while gathering user data for advertising purposes.
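To make the recipe analogy concrete, consider a minimal, purely illustrative sketch of one of the oldest known algorithms, Euclid's procedure for finding the greatest common divisor: a fixed, rule-governed sequence of steps that turns input into output.

```python
# Euclid's algorithm (illustrative sketch): a rule-governed procedure
# that, step by step, reduces the problem until only the answer remains.
def gcd(a: int, b: int) -> int:
    while b != 0:
        a, b = b, a % b  # replace the pair with (divisor, remainder)
    return a

print(gcd(48, 18))  # -> 6
```

Every run follows the same fixed rules; only the ingredients (the input numbers) change, just as a recipe stays the same while the cook varies the batch.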

Algorithms function on the basis of data inputs and require those inputs to be sorted into specific data categories. Just as a cake recipe requires rule-bound input of e.g. flour, eggs, yeast and sugar in a certain sequence, algorithms presuppose certain types of data which they are able to recognize: an algorithm for determining prime numbers accepts integers as input only; an algorithm for birthday wishes requires dates as input; an algorithm for image comparison uses pixel sequences, etc. This implies that an algorithm requires a world categorized into specific data categories. If those categories are insufficient for the purpose at hand, there may be certain things wished for which the algorithm is unable to accept or express—or, conversely, things unwished for which it produces as output.
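The prime-number example above can be sketched in a few lines. This illustrative version makes the point about data categories explicit: the procedure works on integers only, and input outside that category is rejected rather than processed.

```python
def is_prime(n: int) -> bool:
    # The algorithm presupposes one specific data category: integers.
    # Input outside that category is rejected, not processed.
    if not isinstance(n, int):
        raise TypeError("is_prime accepts integers only")
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True

print(is_prime(13))  # -> True
try:
    is_prime("13")   # wrong data category: a string, not an integer
except TypeError as err:
    print(err)
```

A world of dates, pixels or strings is simply invisible to this procedure; it can only "see" what its data category allows.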

The algorithms are continuously updated by software engineers, and their ability to recognize, identify and categorize data is constantly trained through machine learning, in which humans present the algorithms with a series of examples of a given category (e.g. pornography) for them to learn to recognize. Part of this “ability to learn” can also be automated through so-called deep learning, where data are filtered through self-adjusting networks. For instance, it is possible to create several varieties of a given procedure (e.g. the same ad in different colors), which are then launched simultaneously in order to choose the one that turns out most effective. During this process, big tech algorithms become ever more comprehensive and complex, and it is unlikely that any single programmer can grasp them in their entirety. The tech giants have been criticized for giving only very few and specially skilled people—programmers—access to these algorithms, without seeking input from other types of experts.Footnote 5 Algorithms are man-made and thus not necessarily fair, objective or neutral—their categorizations may contain many different kinds of biases, intended or unintended, e.g. of race, gender or politics. Cathy O’Neil has famously labeled such harmful models “Weapons of Math Destruction”: they are opinions embedded in mathematics and technology.Footnote 6
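The ad-variant example can be sketched as a simple A/B test. The click-through rates below are invented for illustration; a real system does not know them in advance but estimates them from live traffic in exactly this way, serving each variety and keeping the one that performs best.

```python
import random

# Hypothetical click-through rates for three color variants of the same ad.
# These numbers are assumptions for the sketch, not real data.
TRUE_RATES = {"red": 0.030, "blue": 0.055, "green": 0.040}

def run_ab_test(impressions_per_variant: int, rng: random.Random) -> str:
    """Serve each variant equally often, count simulated clicks,
    and return the variant with the most clicks."""
    clicks = {variant: 0 for variant in TRUE_RATES}
    for variant, rate in TRUE_RATES.items():
        for _ in range(impressions_per_variant):
            if rng.random() < rate:  # user "clicks" with this probability
                clicks[variant] += 1
    return max(clicks, key=clicks.get)

winner = run_ab_test(100_000, random.Random(42))
print(winner)  # with this much traffic, the best variant ("blue") wins
```

The same pattern, launched at the scale of millions of users, is what lets a platform converge on whichever variety harvests the most attention, with no human judging the varieties at all.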

December 4, 2009 was a decisive day for the third wave of the Internet, as it marked a radical shift in how information is consumed. That day, Google launched a fundamental modification of its search function, “Personalized Search for Everyone”.Footnote 7 Up until then, the key to Google’s success had been the famous algorithm PageRank, named after one of Google’s founders, Larry Page. It ranked keyword search results based on the number and popularity of connections a given website had to other websites—inspired by the ranking of scientific articles based on how often they are cited. The most well-connected websites came up at the top of the search list—no matter who entered the search word. But from that day in 2009 on, the objective criterion was supplemented by subjective ones, tied to the individual user. That made the platform personalized, and since then, no two users get the same results from the same search. Now Google tracks user behavior to generate and store massive amounts of data based on geographic location, search history and other demographic information. Furthermore, personalization provides insight into user behavior based on the dizzying 3.5 billion searches entered every day.Footnote 8 New algorithms enable Google to come up with ever more sophisticated guesstimates about who the user is and what information is personally relevant to that individual. This data collection serves two fundamental purposes. First, the data are supposed to bring to the top of the search list the news and search results most relevant to the individual user. At the same time, the data help advertisers find people likely to buy their products. The latter is the key to understanding the tech giants’ business model: marketers buy ads targeted according to this information, enabling them to present, on Google’s list of results, ads adapted to the detailed personal preferences of users.
In practice, this means that each user is presented with a different, personalized version of the Internet. Many people might still assume that when a word is googled, the results will be objective and the same for everyone—simply the most authoritative results. That was indeed the case in the early days of the PageRank search algorithm, but that standard version of Google is long since outdated.Footnote 9
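The original PageRank idea described above, that a page is important if important pages link to it, can be sketched in a few lines. This is a toy version under simplified assumptions (the production algorithm used many more signals); it iteratively redistributes each page's rank across its outgoing links.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank: `links` maps each page to the pages it links to.
    A page's rank is the share of rank flowing in from pages linking to it."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += damping * share
            else:  # dangling page: spread its rank over all pages
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical four-page "web": C is linked to by A, B and D.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))  # -> C, the most linked-to page
```

Note that the ranking depends only on the link structure, not on who asks: exactly the "objective" property that the 2009 personalization update abandoned.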

Personalization, and the data collection that goes with it, has become the main strategy not only for Google but also for other tech companies such as Facebook and Twitter. That same year, in 2009, Facebook introduced its Like button, which made it possible for users to express simple approval of a given piece of content. It also made it possible for those who had posted the content to use the number of likes as a measuring stick of their popularity. Facebook and Twitter record an extremely detailed portrait of each individual user based on things like clicks, likes, words, movements and networks of “friends”.Footnote 10 Facebook COO Sheryl Sandberg explains the idea behind the strategy this way: “People don’t want something targeted to the whole world—they want something that reflects what they want to see and know”.Footnote 11 This is smart and, not least, very convenient for the user: it enables you to dive directly into specialized news and stories about exactly the topics that matter most to you—more easily than ever before.

Gradually, personalization has moved closer to many parts of an individual user’s life.Footnote 12 To begin with, only ads were tailor-made, then came news and then the entire flow of information: essentially, large chunks of the user’s online life. The business formula is simple: The more personally relevant information the tech giants can offer, the more personally targeted ads they can sell, and the higher the probability of the user buying the product offered.

From the moment news was adapted to the user, it suddenly became possible to get news in real time, customized to the individual user. Many people still get their news through TV, radio, newspapers or digital news sources other than social media. But the figures from the “Reuters Digital News Report 2017”, mapping news consumption in more than 36 countries, among them Denmark and the US, show that more than half of users—54% to be exact—get news via social media, with an even higher percentage among younger generations. Across the countries involved, 33% of youth between the ages of 18 and 24 now primarily get their news from social media, and in many countries, news consumption via messengers such as WhatsApp is growing fast. In these statistics, Facebook is by far the most important source of news.Footnote 13 As early as 2007, founder Mark Zuckerberg boasted that Facebook might be the biggest news source in the world: “We’re actually producing more news in a single day for our 19 million users than every other media outlet has in their entire existence.”Footnote 14 Since then the number of users has increased a hundredfold. However, the word “producing” is overstating things quite a bit, given that Facebook does not produce researched, investigative, fact-checked journalism. The company does not produce news in any standard public sense but limits itself to passing on news produced and financed by other organizations—not to mention tidings of the more private kind about cats, food, love and hate, which users exchange among themselves. But even this softer news category is produced not by Facebook but by its users. Even in its Trending Topics section, Facebook does nothing but forward news coming from others—some of it important, some less so.

Personalizing the entire flow of information means that the tech giants’ algorithms have come to orchestrate large parts of the user’s life. The user consumes gossip alongside status updates, news alongside entertainment and ads. The idea is for the user to live and move inside the current: adding, consuming and redirecting information. In an interview with the Wall Street Journal on the future outlook of the company, Google CEO Eric Schmidt noted: “I actually think most people don’t want Google to answer their questions […] They want Google to tell them what they should be doing next.”Footnote 15 It may sound bizarre, but Schmidt elaborates with a leap of imagination. Picture yourself walking down the street. With the information Google has gathered about you, the company knows more or less who you are, what your interests are and who your friends are. To within a few meters, Google also knows your exact location. Now imagine you need milk and there is a place nearby where milk is sold. This is the moment Google will remind you that you need milk. Moreover, Google will let you know if you are also near a shop that sells precisely those horse track betting posters you recently searched for online. Or it will let you know if the nineteenth-century assassination you just read about took place across the street. In short: the objective is for the user to live in a world where personally relevant information is presented everywhere.

It is nothing new that information and news are adapted to the user. Traditional media have often been shaped according to the opinions and interests of individual readers belonging to specific segments. Consider a phenomenon such as the sometimes heavily lauded Danish “four-newspaper system” of the early twentieth century: it ensured that everyone read a newspaper published by the party they voted for—with news and opinions tailored to each party’s electorate. Today the difference is, first, that it is possible to tailor all the way down to the individual level; second, that the sender’s intention is profit, which is connected to the market, rather than political orientation, which is connected to society. Naturally, the result of this tailoring is that the rest of the media landscape may be ignored. Tech giants help you handle the infinite amount of information available online by giving you an individual news diet. The problem is that this individual tailoring of information automatically creates filter bubbles.Footnote 16 Filtering out everything with no immediate personal relevance creates a bubble of what one already knows, is already interested in, already likes. Generally speaking, a filter bubble has three characteristics: it causes isolation, it is invisible, and it is imposed.Footnote 17 The user may thus be left alone, placed in a sort of intellectual isolation. With search results and news feeds completely tailored to each user, we all see radically different versions of the Internet. Furthermore, the factors that determine the ranking of search results change constantly. The same goes for the algorithms themselves, many of which are programmed to automatically generate variations of themselves in order to get even better results in the form of attention and clicks.
It is, therefore, impossible to know exactly how one user’s search results differ from those of others.Footnote 18 Because the filter bubble is invisible—in the sense that the tech giants avoid drawing attention to the filtering process—one may easily be fooled into believing that it is objective, neutral and true. That is not the case. As the tech giants keep the black box of their algorithms secret, it is difficult or even impossible to spot biases in it—that is, whether it leaves out crucial information. Who does the algorithm believe the individual to be? And why does it show the results it ends up showing?

Lastly, whether the user wants to live in a filter bubble is not up for discussion. The filter bubble is imposed—albeit to a lesser extent on Twitter. Neither Facebook nor Google allows the user to make an active choice about how their world is filtered. Twitter is more open about the fact that it goes through the “profile and activity” of users in order to determine their interests, and that it sells ads based on those interests. As opposed to the two other companies, Twitter allows users to opt in or out. It is even possible to see how many advertisers are tracking the user. But even though it is possible to opt out of “interest-based advertisements” in the personalization and data settings, it is not possible to remove oneself entirely from the advertisers’ target audience. For instance, Twitter tracks which apps, apart from Twitter, are on the device and which websites (with Twitter integration) the user visits.Footnote 19 This feature is enabled in the default settings, which could be a sign that Twitter has already shared the information with potential advertisers. But at least Twitter has allowed for an opt-out. It should not be underestimated, however, that deciding default settings beforehand gives a certain power to the tech companies—a strategy many online companies exploit, based on the observation that very few users inspect their settings, let alone change them. Rather, users rush past the dense and complicated legal text they must accept in order to even get started. Users are busy, they have only so much attention available for making decisions, and they generally trust that everything is okay when no one else seems to take a closer look at the settings. More often than not, the default settings give the companies very deep access to the data and digital traces left by the user. Such trust in the companies seems misplaced.

But why are filter bubbles such a big problem, given the convenience of having the information flow tailored to one’s own preferences? Because the user risks being misinformed—led to accept factually false claims as true. Inside the filter bubble, users risk getting trapped—or trapping themselves—in a closed chamber of conspiracy theories, lies and “fake news”, designed especially for the individual user. For example, if you join an anti-vaccine group on Facebook, the algorithms will redirect you to many other groups which also flirt with conspiracies: Why not join an anti-GMO group as well? Or what about the Flat Earth Society (yes, it does exist)? Or a group with the recipe for healing cancer naturally? Such recommendations can drag the user into isolated communities, each seeing its own reality through its own “facts”. The system of algorithms may calculate that a given user is the anxious type, finding solace in conspiracies. This can have dangerous consequences. In 2016, many Brazilians believed that the authorities were lying when they said the dangerous Zika virus could lead to fetal damage. Rumors were all over social media, and no one was sure whether the true cause of the damage was vaccines, pesticides or mosquitoes.Footnote 20 In Europe and the US, between 2014 and 2017, Russian bots and trolls spread misinformation about vaccines via Twitter—a campaign attempting to weaken public confidence in vaccination and expose users to contagious diseases. In August 2018, the World Health Organization announced that a record number of people had been infected with measles in Europe. According to experts, this wave of infections is the result of declining vaccination rates. In the US, the number of parents refusing to vaccinate is also on the rise.Footnote 21 On Twitter, one might get the impression that this otherwise safe and efficient vaccine is extremely disputed.
Theories and ideas found online lead people, highly educated or not, to believe that they are capable of seeing through the pharmaceutical industry and professional health recommendations.

Inside the filter bubble, users find confirmation of their already existing points of view, or at least of latent ones rooted in exactly their particular personality type. This tendency is called confirmation bias, and today it is boosted by algorithms automatically sorting away information that might challenge the user’s existing view of the world.
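As a purely hypothetical illustration (not any platform's actual code), the mechanism can be sketched as follows: if a feed ranks stories by their similarity to a profile inferred from the user's click history, content that challenges that profile mechanically sinks to the bottom.

```python
# Toy feed ranker, invented for illustration. Each story carries a
# stance score in [-1, 1]; the "profile" is the average stance of
# stories the user previously clicked. Ranking by closeness to that
# profile systematically buries challenging views.

def rank_feed(stories, click_history):
    profile = sum(click_history) / len(click_history)
    return sorted(stories, key=lambda s: abs(s["stance"] - profile))

stories = [
    {"title": "Opposing view", "stance": -0.8},
    {"title": "Neutral report", "stance": 0.0},
    {"title": "Agreeable take", "stance": 0.7},
]
feed = rank_feed(stories, click_history=[0.6, 0.9, 0.75])
print([s["title"] for s in feed])
# -> ['Agreeable take', 'Neutral report', 'Opposing view']
```

No single line of such a system sets out to deceive; the confirmation bias is an emergent property of optimizing for predicted agreement.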

The filter bubble not only traps users within the confines of already established interests and positions; it also keeps them out of other bubbles. A person is no longer presented with alternative world views, let alone enabled to see what the bubbles of others actually look like. You may never meet the worlds, ideas and arguments of opponents, be they political, religious or other. You may lose the beneficial habit of attempting to understand why others hold the opinions they hold, and fall back on assuming that other viewpoints are simply crazy, stupid, pathological or evil. Outside the bubble, we may sometimes realize that an opponent is right; at other times, knowing the details of the opponent’s position is necessary for finding a compromise. The filter bubble does not support such crucial social options—quite the contrary.

The years 2015–16 saw a vicious example of filter bubbles at work. The Google-owned video sharing service YouTube became a key place for the Alt-right movement to organize its extreme rightist views of white supremacy, malicious racism and the like. This went unnoticed by many users, because it happened in a growing bubble of enthusiastic users whose activity most others never discovered—unless they actively searched for those kinds of videos. The YouTube algorithm sent viewers of the extremist material still more videos of the same kind, oftentimes even more extreme ones (more on this later). Some users with many followers were paid a cut of the on-screen ad revenues their videos generated, and selected super-users were even paid to produce more videos featuring their political extremism and upload them to their “preferred” YouTube channel.Footnote 22 In the spring of 2017, this was revealed by traditional media, leaving YouTube with almost as much explaining to do as Facebook would have in 2018. A number of patchwork solutions were quickly introduced to reassure the public and advertisers, but even though YouTube probably no longer directly pays fascists for their views, the structural problem is here to stay: big, problematic, crazy movements can germinate in the shadows of the tech giants without the general public finding out. The media ought to assign investigative journalists to continuously check a variety of alarming keywords on the different platforms in order to identify such filter bubbles.

Nevertheless, empirical studies of polarization suggest that the problems caused by filter bubbles will develop in full only in a future where the vast majority of news coverage takes place online. During the 2016 presidential election in the US, the most important news sources were still television networks such as CNN and Fox News (which, to be sure, can also cause bubble effects if a person sticks to one single channel). According to a 2017 survey, the groups of people in the US who spent the least amount of time online were also the groups who had seen the highest increase in political polarization between 1996 and 2012. Such a result is a compelling reason not to conclude that the full effects of filter bubbles are already here.Footnote 23 This does not exclude, however, the existence of bubbles related to knowledge, news, etc. Maybe the result tells us that people with less online experience are more prone to wind up in a filter bubble, whereas people who spend more time online have become more capable of withstanding bubble effects. Despite these important findings, the general diagnosis of filter bubbles is presumably a forecast, warning of a possible future trend.

Generally speaking, personalization has given the tech giants a very powerful position—they have become a camera lens between user and reality. The user may live inside the filter bubble, where freedom of choice has been replaced by free selection among items on a highly personalized menu. The filter bubble controls what we see and what we do not see. It is indeed convenient that Google comes up with exactly the right recommendation and presents ads for things the user is actually interested in. But this comes at the expense of the user’s freedom. In order for people to act freely, the future must be open and information freely available. In the data race for relevance, the tech giants have in fact turned into predictability machines. To a large extent, the giants’ harvest of big data makes it possible to predict human behavior, which means that the future can be calculated and controlled. In the third wave of the Internet, users risk having to give up their freedom in order to achieve convenience. This presents a grave danger to freedom of expression, understood in the accompanying sense of freedom of information. The danger is not so much censorship and removal of certain topics—that happens as well, more on this later—but the fact that the user is led to believe that the individual’s bubble makes up the whole relevant world.