
1 Introduction

Much like the previous Industrial Revolutions, the Digital Revolution is sure to have enormous impact on society, at many different levels. By learning from the unparalleled amounts of individual-level data that is currently shared and collected, machines will be increasingly able to identify patterns, create profiles, predict behaviors, and make decisions. Therefore, it is fundamental to understand the limitations of these tools to anticipate and minimize negative consequences.

In this chapter we focus on machine-learning (ML) models, particularly recommendation (or recommender) systems (RS), and on how their use in decision-making processes can offer better services but also create important risks. Typically, RS refer to algorithms that recommend some item X to a user A, very often consumption goods (such as a book the algorithm identifies as matching our interests) or content in social networks (a new friend or post); here, however, we use the term in a broader sense, to refer to any algorithm that uses large datasets on people to identify similarity matches and recommend decisions, in many different contexts. First, we describe how RS work and their unavoidable limitations. Second, we focus on RS that work as intended and discuss how the creation of individual profiles can lead to abusive targeted advertisement and even to threats to democracy, from disinformation to state surveillance. Third, we describe what happens when these systems are faulty but are still used to make probabilistic generalizations and aid AI-based decision-making; we offer specific examples of how mistakes in data selection or coding can lead to discrimination and injustice. In the last section, we summarize some ideas on how to make AI more accountable and transparent, and argue that the important decisions ahead should not be made by a limited group of unelected AI leaders; rather, it should be the role of AI experts to raise awareness of such threats, paving the way for important regulatory decisions.

2 Recommendation Systems

The general goal of a recommendation system is to suggest, as accurately as possible, new items to a user while optimizing for the rate of acceptance (Resnick and Varian 1997). These systems leverage information on users (demographics, past choices) and/or on items (for example, movies) to find accurate matches between them (e.g., if you liked movie X you might like movie Y, or people “like you” have enjoyed book Z). At the core of RS is the assumption that the items and/or users of a service can be mapped in terms of their similarities, and that person A (or item X) can serve as a proxy for person B (or item Y). In that sense, a recommendation system suggests items that are closer, in such a similarity space, to a user’s past choices or revealed preferences, and it is only as good as the accuracy of its recommendations (person A will actually enjoy movie Y and book Z).

Over the past decades we have seen an increase in the use of ML/AI techniques to support the development and implementation of faster, more reliable, and more capable RS (Fayyaz et al. 2020). These advances are possible because of (a) the large accumulation of data about users’ past choices; (b) large datasets detailing the items themselves; and (c) increasingly sophisticated algorithms that take advantage of such data and extract value from growing numbers of features and instances.

In general, recommendation systems can be divided into three big families: collaborative filtering (which tries to predict whether person A will like product X based on the preferences of “similar” person B); content-based filtering (which tries to predict whether person A will like item X based on person A’s past revealed preferences for similar items); and hybrid systems, which combine both. In terms of the algorithms used, these are further divided according to whether they rely on heuristic- or model-based approaches.

Collaborative filtering (Herlocker et al. 2000) recommendation systems rely on the similarity between users to perform recommendations. That is, if users A and B are similar, then the past choices of B can shed light on what to recommend to A. Hence, the core technical challenge is to estimate similarities between users or items from data on users’ revealed past preferences (e.g., past favorite movies). This approach has been widely popular on web-based portals such as Netflix and Reddit, where users’ characteristics and up or down votes are used to estimate similarities. Popular algorithms range from graph models of user-similarity networks (Bellogin and Parapar 2012) and nearest-neighbor methods to linear regression, clustering techniques (Ungar and Foster 1998), artificial neural networks (He et al. 2017), and Bayesian networks.
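As a minimal, illustrative sketch of the neighborhood flavor of this idea (all users, items, and ratings below are invented, and real systems operate on vastly larger and sparser matrices), the following Python snippet predicts a missing rating as a similarity-weighted average of other users’ ratings:

```python
import numpy as np

# Toy user-item rating matrix (rows = users, columns = items, 0 = unrated).
# User A (row 0) has not rated item 2; we predict that rating from "similar" users.
ratings = np.array([
    [5, 4, 0, 1],   # user A
    [5, 5, 4, 2],   # user B
    [4, 4, 5, 1],   # user C
    [1, 2, 1, 5],   # user D
], dtype=float)

def cosine(u, v):
    """Cosine similarity between two preference vectors (1.0 = identical taste)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def predict_rating(user_idx, item_idx, R):
    """Similarity-weighted average of the ratings other users gave this item."""
    sims, vals = [], []
    for other in range(R.shape[0]):
        if other == user_idx or R[other, item_idx] == 0:
            continue
        sims.append(cosine(R[user_idx], R[other]))
        vals.append(R[other, item_idx])
    if not sims or sum(sims) == 0:
        return float("nan")        # cold-start case: no usable neighbours
    return float(np.dot(sims, vals) / sum(sims))

print(f"Predicted rating of item 2 for user A: {predict_rating(0, 2, ratings):.2f}")
```

Model-based approaches replace this direct averaging with learned representations, but the underlying proxy logic (user B stands in for user A) is the same.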

Often, auxiliary data is used either to improve collaborative filtering systems or to overcome some of their limitations. Context information (e.g., location or time) can help systems achieve higher success and, in scenarios with more users than items, recommendations are often made through item-item similarity. Moreover, when past information on user activity is scarce (e.g., a new user or a new service), information about users’ social relationships and characteristics (e.g., gender, age, income, location, employment) can help these systems establish similarity even without specific historical activity.

Content-based filtering (Lops et al. 2011; Aggarwal 2016) does not require information about users; instead, it maps similarity between items to perform recommendations. In other words, users are recommended items similar to their past choices (person A likes vanilla milkshakes and thus might also like vanilla ice-cream). Algorithmically, these problems are approached with techniques that range from TF-IDF (Rigutini and Maggini 2004) and clustering for topic modelling and inference to classification models based on Bayesian classifiers, decision trees, and artificial neural networks. A traditional application of these techniques is in book recommendation engines that measure content similarity between books (Mooney and Roy 2000).
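A hedged sketch of the TF-IDF route mentioned above (the catalog, descriptions, and “liked” profile are all invented for illustration; production engines use far richer item metadata) ranks items by the similarity of their text descriptions to what the user already liked:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented catalog of book descriptions and an invented "profile" built from
# descriptions of books the user already liked.
catalog = {
    "Book 1": "space exploration and artificial intelligence thriller",
    "Book 2": "romantic comedy set in a small coastal town",
    "Book 3": "essays on deep learning, robotics and the future of intelligence",
}
liked_profile = "popular science writing about machine intelligence and space travel"

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(list(catalog.values()) + [liked_profile])

# Rank catalog items by similarity between their TF-IDF vectors and the profile.
scores = cosine_similarity(tfidf[-1], tfidf[:-1]).ravel()
for title, score in sorted(zip(catalog, scores), key=lambda pair: -pair[1]):
    print(f"{title}: similarity {score:.2f}")
```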

Hybrid solutions (Burke 2007) combine aspects of both content-based and collaborative filtering. They arise in situations where it is practical and beneficial to develop meta-algorithms that balance the recommendations stemming from collaborative- and content-based systems. These increasingly use complex deep learning algorithms and are common in social media recommendations, including newsfeed content, advertisements, and friends (Naumov et al. 2019).
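In its simplest form, such a meta-algorithm can be a fixed-weight blend of the two component scores. The sketch below (items, scores, and weights are arbitrary placeholders) illustrates the idea; the deep-learning systems cited above instead learn the combination from data:

```python
# Invented scores on a [0, 1] scale; a real system would produce these with the
# collaborative and content-based models sketched above.
candidate_scores = {
    "item_x": {"collaborative": 0.82, "content": 0.40},
    "item_y": {"collaborative": 0.55, "content": 0.90},
    "item_z": {"collaborative": 0.30, "content": 0.35},
}

def hybrid_score(scores, w_collaborative=0.6, w_content=0.4):
    """Fixed-weight blend of the two component scores (weights are arbitrary here)."""
    return w_collaborative * scores["collaborative"] + w_content * scores["content"]

ranking = sorted(candidate_scores,
                 key=lambda item: hybrid_score(candidate_scores[item]),
                 reverse=True)
print("Hybrid recommendation order:", ranking)
```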

There are several limitations and problems associated with the development of RS. From the technical perspective, problems can arise at the two extremes of the data spectrum. First, lack of initial data can lead to a “cold start” that prevents the setup of the entire recommender system (e.g., new movies that have not yet been rated by anyone, or that come from little-known studios or directors) or limits the recommendations that can be given to new users. Second, there is the “sparsity problem”: when the number of items to be recommended is very large, the algorithm might lack scalability and users keep seeing the same few recommendations, either because these are the few most rated or because individual users have only rated a few items (Adomavicius and Tuzhilin 2005). Third, and more conceptually, the implicit assumption in ML/AI solutions that the future can be predicted from past actions renders them awkwardly unable to perform under novelty (e.g., expanding a service to a new cultural setting). Fourth, some of the models described above can learn from past mistakes (user A hated book Z after all) and therefore improve continuously, but this is not the case in several other examples of RS, sometimes with dire consequences. Important examples of the impact of these limitations will be discussed in more detail in the following sections.

3 When Recommendation Systems Work

3.1 Implications for Consumption

Although stemming from a seemingly intuitive and simple problem, recommendation systems have matured into highly complex algorithmic solutions that are able to leverage a multitude of data sources to improve services underlying the success of some of the largest companies in the world. The success of RS, and their ubiquity, stems from their capability to enhance user retention and to seemingly help users find content relevant to their profile. Moreover, it is already possible to extract information and patterns from both structured data (e.g., an online shopping basket) and unstructured data (e.g., free text, images, and videos). As such, RS keep improving quickly and will offer ever greater gains to content providers.

Naturally, the specifics of each system largely depend on its application context and goals. Take, for instance, Amazon, which started with item-to-item approaches but currently leverages information from users’ past orders, profile, and activity to offer different types of personalized recommendations, from targeted e-mails to shopping suggestions (Smith and Linden 2017). The consultancy McKinsey estimated that Amazon’s RS was responsible for 35% of its sales (MacKenzie et al. 2013). Netflix gained international fame among engineers and enthusiasts when, in 2006, it released a dataset of 100 million movie ratings and offered a 1 million USD prize to the team that could develop the best RS. Ten years later, Netflix’s RS was estimated to be worth up to 1 billion USD (McAlone 2016) and to drive 75% of users’ viewing choices (Vanderbilt 2013).

However, these large companies depend on data freely and often willingly shared by their users, who may give away control of their privacy and decisions in exchange for convenience and productivity. In fact, we all know that our data is being used, but we may not know the extent to which this is happening or the problems it could pose (Englehardt and Narayanan 2016). First, although these processes involve consent, terms of service are often unintelligible, and sharing is not always voluntary, as it might be a requirement to access content or services (Solomos et al. 2019; Urban et al. 2020). Second, even when it is voluntary, it can have unexpected implications. In 2012, the New York Times reported a case in which the US-based chain Target generated predictions about the pregnancy of its customers precisely by analyzing shopping profiles (Duhigg 2012). One such store was visited by a father, outraged that his teenage daughter had received promotions for baby products; later, when the store manager called to apologize, the man embarrassedly replied that his daughter was indeed pregnant: the supermarket chain knew before the family did. Such anecdotal situations corroborate our increasing reliance on these systems, which also makes us vulnerable to manipulation. For instance, nothing keeps online stores from showing more expensive products to people who did not previously compare prices online (Mikians et al. 2012). Indeed, different web services commonly trade user information for marketing purposes, and it is common for a user who searches for jeans on Amazon to be immediately targeted with jeans ads on Facebook or Google. Importantly, these “surveillance systems” are so prevalent and increasingly sophisticated that even when you are cautious about what you publish online, that caution itself can be informative (Zuboff 2019).

As mentioned, the traditional application of RS is to drive the consumption of content and products, and as such it represents the most common deployment of these algorithmic building blocks (similarity identification, reliance on proxies, prediction of future outcomes). However, the same building blocks also find application in other types of algorithmic decision-making (e.g., credit scoring or financial trading), and we will use the term in this broader sense to further discuss their current implications.

3.2 Implications for Democracy

As described, RS can be very useful in directing people to products that interest them, be it movies or diapers. But there is a thin line between informing and manipulating, and this is particularly relevant when the promoted “goods” are news or ideas. Social networks, such as Facebook or Instagram, have long been known to promote addictive attention, even at the cost of spreading disinformation (Del Vicario et al. 2016; Vosoughi et al. 2018), creating echo chambers (Nikolov et al. 2015; Quattrociocchi et al. 2016), increasing polarization (Flaxman et al. 2016), and threatening users’ mental health. From researchers to data protection advocates, many have voiced concerns about the data that large platforms collect and how their recommendation systems can manipulate the information individuals are exposed to, be it feeds prioritizing posts or search engines displaying sponsored ads. In fact, Facebook offers any interested ad placer the possibility of selecting among tens of individual characteristics, including (estimated) age or gender, level of education and in which subject, income level, hobbies, political orientation (if from the US), travel profile, and even whether targets are away for the weekend with family or friends (Haidt and Twenge 2021).

The Cambridge Analytica case, in early 2018, brought to the public spotlight how, through refined individual profiling, political campaigns could influence the voting of targeted individuals or constituencies. Political scientists have argued that the use of modern data science approaches to politics represents a significant shift from classical strategies: marketing techniques have been used in politics since at least the 1930s (O’Shaughnessy 1990), but the speed and increasing precision of AI tools mean that political messages no longer need to be general and appeal to a broad constituency; instead, campaigns can send highly personalized messages based on individual profiles, saying one thing to one demographic and the opposite to another, with very little scrutiny (Aldrich et al. 2015; Ribeiro et al. 2019; Silva et al. 2020). That some of these messages can include untrue information is even more worrisome. Naturally, the political use of misleading and even outright false information is nothing new, but the surge in online activity, coupled with poor digital literacy and individual-level consumer profiling, has all the ingredients of a perfect storm. Disinformation spreading has found fertile ground on social networks, often through emotion manipulation, first shown to occur by Facebook itself (Kramer et al. 2014) and to affect not only the targeted individuals but also their friends (Coviello et al. 2014). There is also increasing evidence that some individuals might be more susceptible to political disinformation than others (Pennycook and Rand 2021), with specific cognitive biases playing important roles.

In fact, personalized algorithms on search engines and social network feeds might strengthen these already existing biases in at least three different ways: (1) information is filtered based on past history, potentially magnifying availability biases, in which individuals tend to rate as more important the things they can more easily recall (Abbey 2018), and confirmatory tendencies, in which individuals seek or particularly trust information that reinforces or confirms their beliefs (Burtt 1939); (2) humans tend to associate with others similar to them and to favor people in their own group over people identified as belonging to outgroups (ingroup bias) (Nelson 1989); and (3) beliefs, biases, and even disinformation are amplified and reinforced by these closed, homophilic communities (McPherson et al. 2001), leading to the already mentioned echo chambers (Barberá et al. 2015; Flaxman et al. 2016; Quattrociocchi et al. 2016) and to increased online hostility and polarization (Yardi and Boyd 2010; Conover et al. 2021).

But political (mis)information is not the only kind to have a heavy impact on society and democracy. In April 2020, Facebook acknowledged that millions of its users saw false COVID-19-related information on its platform (Ricard and Medeiros 2020); on Twitter, according to Yang et al. (2020), “the combined volume of tweets linking to low-credibility information was comparable to the volume of New York Times articles and CDC links”; and by August 2021, YouTube had removed 1 million videos that included dangerous COVID-19 misinformation. Importantly, there is evidence that such misinformation affected vaccination hesitancy (Loomba et al. 2021) and compliance with control measures (Roozenbeek et al. 2020), in line with the notion that misinformation often serves the goal of creating divisive content and fomenting social unrest (Emmott 2020; Ricard and Medeiros 2020; Barnard et al. 2021; Silva and Benevenuto 2021). For much of 2020 and 2021, the world was fighting two pandemics in parallel: one caused by a virus, and another caused by fake news, supported by human bias and attention-maximizing algorithms (Goncalves-Sa 2020).

Another very relevant risk comes from societal control. As described, politicians can use social networks and AI systems to target possible voters, but several leaders have also realized the much broader potential of AI, from improving public administration to creating war robots. According to Vladimir Putin, “Artificial intelligence is the future, not only of Russia, but of all of mankind (…) Whoever becomes the leader in this sphere will become the ruler of the world” (Allen 2017). China hopes to be that leader by 2030 (Department of International Cooperation Ministry of Science and Technology (MOST) 2017) and is designing and implementing a large-scale social experiment that involves using RS to classify citizens according to their social behavior, made possible by AI-driven facial recognition technology (Liang et al. 2018). Such models have been increasingly used around the world, often for security purposes. In 2020, the Israeli and US armies used AI to track and assassinate an Iranian physicist (Bergman and Fassihi 2021).

All these examples describe situations in which RS are worrying because they work as intended, be it to improve consumption or to target voters. In the next section, we focus on situations in which they fail and on how that can have consequences for individuals and societies.

4 When Recommendation Systems Fail

The recommendation systems described so far use fine-grained information to train AI models that target specific individuals. Typically, what these systems do is output the probability of a certain event and aid in decision-making. Examples range from algorithms that calculate a risk score for depression (Reece et al. 2017; Eichstaedt et al. 2018), to algorithms that try to identify the best candidate for a given position (Paparrizos et al. 2011), to algorithms that recommend a movie based on previous choices (Bennett and Lanning 2007). These algorithms are trained on large datasets of variable quality and “learn” by trial and error, with subjective definitions of error (for example, what “best candidate” means must be represented as a mathematical object, when often “best” cannot be easily quantified). This means that there is no real distinction between the model, the data used to train it, and the assumptions that the coder made: if the data or the target are biased, the model will be biased. These biases might appear at different steps and have different consequences, but it is important to realize that: (1) it is virtually impossible to have a complete dataset, and all datasets are samples, biased by the sampling process; (2) there are human decisions involved in defining targets; (3) targets often rely on proxies; and (4) the predictions might turn into self-fulfilling prophecies, because they frequently impact outcomes and it is often very difficult to obtain external validation. Again, this might be of little importance in the case of a user who never gets to see movie X because it is not suggested, but very serious in the case of someone who gets a credit request denied and, consequently, defaults on another payment: the system might find confirmation that the credit refusal was the best decision when, indeed, it was what caused the default.

Consequently, there should be no illusions of “model neutrality”. All models have problems, and acknowledging this is a fundamental step in designing mitigation strategies. In this section, we describe how biased data leads to biased algorithms and how biased algorithms can lead to discriminatory policies, and we offer some examples from both the private and public sectors.

4.1 Learning from Biased Data: Implications for Individuals

As there is no perfect dataset, it is important to understand a dataset’s limitations before training any algorithm on it. Let us think of a model to identify the best candidates to enter engineering school. One would start by collecting vast amounts of data, including grades, happiness scores, time to degree completion, previous education, future career, etc., on all students who have ever gone through a given university. This dataset would still have no information on how good the rejected candidates could have become (sampling bias) or on how many students eventually suffered from burnout (limitations and subjectivity in feature selection): this means that if the system systematically rejected promising candidates in the past, the algorithm is very likely to continue doing so in the future; and that if, for example, it values prizes over creating a safe work environment, it might pave the way for more accidents in the future. It is also easy to anticipate that such a dataset would be unbalanced in terms of gender, and likely also age, ethnicity, nationality, and probability of wearing glasses, and so would the model’s predictions. In fact, several previous attempts at training such algorithms to select applicants, for schools or jobs, have led to discriminatory practices, stirring large discussions (O’Neil 2016). It is important to note that, very often, these algorithms are created not just to speed up and automate processes, but also because we know that human-based systems are biased: the assumption is that models would be blind to color or gender and, thus, fairer. However, RS trained with biased data will generally be biased as well (Garcia 2016), and this is true even if models are trained on very large datasets. For example, the increasingly popular ChatGPT application was trained on around 570 GB of data, but most of this data was obtained from the internet, which is known to overrepresent some countries and age groups (Sheng et al. 2021).
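The mechanism is easy to reproduce. The fully synthetic Python sketch below (all variables, coefficients, and group labels are invented for illustration) trains a model on historically biased admission decisions without ever seeing the protected attribute, and still reproduces the disparity through a correlated proxy feature:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 5000
group = rng.integers(0, 2, n)                       # two demographic groups (0 and 1)
merit = rng.normal(0.0, 1.0, n)                     # unobserved "true" ability
# An observed feature (e.g., a school-prestige score) correlated with group membership:
proxy_feature = merit + 1.5 * group + rng.normal(0.0, 1.0, n)
# Historical admission decisions favoured group 1 regardless of merit:
admitted = (merit + 1.0 * group + rng.normal(0.0, 0.5, n)) > 0.5

# The protected attribute `group` is never given to the model...
model = LogisticRegression().fit(proxy_feature.reshape(-1, 1), admitted)
predicted = model.predict(proxy_feature.reshape(-1, 1))

# ...yet predicted admission rates still differ sharply between groups.
for g in (0, 1):
    rate = predicted[group == g].mean()
    print(f"group {g}: predicted admission rate = {rate:.2f}")
```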

An area in which such discrimination can have dire consequences is health. Kadambi (2021) identifies crucial sources of bias in medical devices, including computational bias, which happens when the datasets used in clinical trials, or to train the algorithms that select candidates for such trials, are themselves biased. Historically, this has been the case for specific ethnic groups and for women, who are often underrepresented in health datasets (and even in experimental protocols).

Biased datasets have also been shown to play important roles in classification and facial recognition (Buolamwini and Gebru 2018; Barlas et al. 2019). For example, Twitter dropped its picture cropping algorithm after suspicions of racial bias (Agrawal and Davis 2021) and both Flickr and Google algorithms tagged photos of black individuals as apes (Zhang 2015).

As disastrous as these examples are, it can be argued that they are the price to pay for the learning process: they are cautionary tales, reminding us that we are still in the infancy of machine decision-making and that many other mistakes will be made before we can rely on algorithms. Unfortunately, and despite their current limitations, many such systems are already being deployed, including in punitive environments, as described in the next section.

4.2 From Bad Algorithms to Discriminatory Policies

The individual consequences of a faulty Netflix algorithm are probably easy to minimize; models that select candidates for a given job can have much worse consequences, but nothing compares to when such algorithms are deployed in a large-scale punitive context. We already mentioned how the Chinese government is using facial recognition and other AI tools to evaluate citizens according to their behavior. If the system is faulty, the consequences for the individuals can be tremendous.

Another much debated example, one that relies on proxies, is COMPAS, a proprietary algorithm that helps US judges set bail based on estimated risk scores of future offenses. In 2016, COMPAS was analyzed by ProPublica (Larson et al. 2016) and shown to discriminate against individuals based on their race: for similar offenses and crimes, black defendants were more likely to be given higher risk scores. Importantly, the datasets used to train the model did not include information on race: the model was possibly using zip code as a proxy for risk, thereby picking up the correlation between ethnicity and economic status in US society (this analysis is disputed by the owning company, and the extent of the discrimination is still being debated (Spielkamp 2017)).

Despite so many notable failures, governments around the world have been sponsoring the development of algorithms for use in public administration, in a variety of areas. These algorithms are often proprietary and function as black boxes: not even the government officials know how they work and what justifies their recommendations or risk scores. This leaves very little room for people to complain about or even understand their “evaluation”, raising fundamental legal questions. The Dutch government used such an algorithm, SyRI, from 2014 to 2020, when the Court of The Hague halted its use (Amnesty International 2021). It aimed at identifying social welfare fraud, was trained on large governmental datasets, and included information on virtually all inhabitants of the Netherlands. It would generate risk scores and, if these were high, trigger an investigation. However, it was shown that the algorithm disproportionately and unfairly targeted poor and minority communities (Xenophobic Machines 2021), with consequences so dire that they led to the resignation of the Dutch government.

Such faulty algorithms have been increasingly exposed (Bandy 2021) but, naturally, it can be argued that they only reveal past and pre-existing bias, hidden in the data, and that human decision-making is equally discriminatory. While the first contention is very likely true, it still raises the important question of whether it is acceptable to perpetuate such discriminatory practices under an illusion of mathematical neutrality. The second is more interesting, as it is difficult to quantify whether humans or current algorithms are more discriminatory (Dressel and Farid 2018), but at least for the examples described here, there are at least two good arguments in favor of the latter being worse. One is technical: AI models identify dominant patterns and are more likely to exclude relevant outliers (for example, the brilliant candidate from a very poor, black neighborhood). The other is scale: human panels might have their own biases, but these differ from panel to panel, and there are human limits to how many applicants a panel can see; such limits and natural variation do not necessarily apply to machine decision-making (O’Neil 2016). A third, less studied possibility is that algorithms, including seemingly commercial ones such as cookies, might be used to mask deliberate targeting of individuals by state actors, as in the case of the identification of minorities (Borgesius 2018). Therefore, there are serious concerns that algorithms trained on biased datasets will not only make biased decisions but also amplify existing societal discrimination and unfairness.

5 A Way Forward

There is much room for improvement in current and future RS (and AI in general), and we propose six steps, summarized in a simple mnemonic (ATI) (Fayyaz et al. 2020). The first is recognizing that these systems are not neutral and can be very prone to bias. This Acceptance should be obvious, but it is still disputed by several in the field, who typically contest (a) that the systems are not biased, as algorithms are blind to individuals; (b) that they are no more biased than non-algorithmic systems; or (c) that this is a problem for social scientists and that engineers and programmers should not be concerned with such issues. In fact, most Data Science and Artificial Intelligence graduate programs still do not include Ethics or even Algorithmic Fairness courses, effectively training generations of students to ignore fundamental problems with the datasets, algorithms and, consequently, recommendation and decision systems they design and often implement. Such content should be compulsory in all formal AI education, taking us to the second step: Training.

Another fundamental issue is the lack of Inclusion and Diversity. This is observed not only in the training datasets, as already discussed, but also in the coding teams. In “Racist in the Machine” (Garcia 2016), Megan Garcia describes some grave consequences of design blind spots and gives the example of four smartphone personal assistants (Siri, Google Now, Cortana, and S Voice), increasingly used for help in health and emergency situations, that could not recognize “I am being abused” or “I was beaten up by my husband”. ML teams should be diverse and bring together people who work in different disciplines and who can contribute to both the technical and social components of algorithm design. Moreover, the fact that algorithms try to find similarities can lead not only to polarization and homophily, but also to uniformization. As Thomas Homer-Dixon put it, “a simplified, uniform global culture will inevitably have less diversity of ideas and ingenuity that can help us cope with the great challenges we’re facing” (Homer-Dixon 2001). Diversity should be a value at many different levels.

RS pipelines should also include streamlined Data Auditing and Debiasing: accepting auditing as an integral part of data processing pipelines recognizes its importance for fair and effective AI, while reducing dataset bias. One of the first such efforts was developed by Pedro Saleiro and Rayid Ghani, through a data auditing tool, Aequitas, that inspects datasets for different types of demographic imbalances, including age, gender, and race (Saleiro et al. 2018). In all three datasets analyzed, they found important biases that affect model results. These are excellent first efforts, but it is important to note (a) that we can only audit data in a very limited number of instances, and (b) that debiasing is even more challenging. For example, we can check gender-labelled datasets for gender imbalances (as in the hiring example described above), but this might be impossible in fully anonymized datasets or in datasets that simply do not include possibly relevant attributes, as is often the case for ethnicity or physical disabilities. Even more critically, we cannot identify biases that we, as a society, do not know we have: it might be the case that people with glasses are perceived as more competent for some jobs; as we are unaware of it, we would not include “having glasses” as a label and, even if we did, we would most likely not audit our algorithm for possible discrimination. But let us assume that complete auditing was possible and that all possible discriminatory imbalances in our dataset had been identified: we would still have important decisions to make regarding how to de-bias them. Continuing with the college admission example, it should be possible to understand that the model is being trained on a gender-unbalanced dataset and that correcting for it would lead to more women candidates being selected. But how big should the correction be? Should it reflect past ratios of engineering school admissions, perpetuating existing imbalances, or should it aim for the ratios observed in the population, effectively imposing a 50% gender quota? These and other examples illustrate how many of these decisions can be, and effectively are being, made, often implicitly, and how ill-informed attempts to correct bias might generate new forms of unfairness.
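To make the auditing-versus-debiasing distinction concrete, the hedged Python sketch below (group labels and target shares are invented; real audits, for example with Aequitas, cover many more groups and metrics) measures how far each group’s share in a dataset deviates from a chosen reference distribution and derives simple reweighting factors. Choosing that reference distribution is precisely the moral decision discussed above:

```python
from collections import Counter

def audit_and_reweight(group_labels, target_shares):
    """Compare observed group shares to a chosen reference distribution and
    derive per-group sample weights that would rebalance the training data."""
    counts = Counter(group_labels)
    total = len(group_labels)
    report = {}
    for group, target in target_shares.items():
        observed = counts.get(group, 0) / total
        weight = target / observed if observed > 0 else float("inf")
        report[group] = {"observed_share": round(observed, 3),
                         "target_share": target,
                         "sample_weight": round(weight, 2)}
    return report

# Illustrative admissions records: 80% of historical applicants on file are men.
labels = ["M"] * 800 + ["F"] * 200
# Choosing the reference is the moral decision: population parity (0.5/0.5) and
# the historical admission ratio would yield very different corrections.
print(audit_and_reweight(labels, {"M": 0.5, "F": 0.5}))
```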

These decisions are fundamentally moral, helping to create a society by design. Thus, the final step should be Transparency. As Rhema Vaithianathan put it, “If you can’t be right, be honest” (Courtland 2018). Blackbox algorithms, in which the process and features used to reach a decision are unknown or proprietary, should be avoided. However, they are increasingly used, for two main reasons: first, it can be argued that if the decision process is known, individuals and companies could abuse and even rig the system in their favor; second, the more complex the algorithm, as is the case with deep learning, the more difficult it is to understand its decision-making process. Therefore, it has been argued that such algorithms should only be used in positive environments, and only when they significantly outperform traditional processes (e.g., for medical diagnosis), and never in punitive contexts (such as in the COMPAS and SyRI examples). In any case, individuals should always have the right to access, verify, correct errors in, and appeal algorithmic decisions. As these processes are often very complex, this generates an important tension, extensively noted by the thinkers of the so-called “Risk Society” (Beck 1992): technical expertise is fundamental to design and control such systems, but this control should be exercised by a largely lay society. Therefore, those who understand the problems should also accept their political and social responsibility and engage in active Interaction with communities and decision-makers.

6 Conclusions

It should be increasingly obvious that machine-based decision-making is far from neutral and that its problems have important societal implications. In this chapter, we summarized some limitations of recommendation systems, from both technical and conceptual perspectives, and offered examples of their past, ongoing, and possible future negative impacts. Overall, we argue that these risks should be understood by the general population, and we offer specific guidelines for improving RS and societal oversight.