1 Introduction

We interact with recommender (or recommendation) systems (RS) on a regular basis when we use digital services and apps, from Amazon to Netflix and news aggregators. They are algorithms that make suggestions about what a user may like, such as a specific movie. Slightly more formally, they are functions that take information about a user's preferences (e.g. about movies) as input, and output a prediction of the rating that the user would give to the items under evaluation (e.g., newly available movies), or of how the user would rank a set of items, individually or as a bundle. We shall say more about the nature of recommender systems in the following pages, but even this general description suffices to clarify that, to work effectively and efficiently, recommender systems collect, curate, and act upon vast amounts of personal data. Inevitably, they end up shaping individual experience of digital environments and social interactions (Burr et al. 2018; de Vries 2010; Karimi et al. 2018).

RS are ubiquitous, and there is already much technical research on how to develop ever more efficient systems (Adomavicius and Tuzhilin 2005; Jannach and Adomavicius 2016; Ricci et al. 2015). In the past 20 years, RS have been developed with a focus mostly on business applications. Although researchers often adopt a user-centred approach focused on preference prediction, it is evident that the applications of RS have been driven by online commerce and services, where the emphasis has tended to be on commercial objectives. But RS have a wider impact on users and on society more broadly. After all, they shape user preferences and guide choices, both individually and socially. This impact is significant and deserves ethical scrutiny, not least because RS can also be deployed in contexts that are morally loaded, such as health care, lifestyle, insurance, and the labour market. Clearly, whatever the ethical issues may be, they need to be understood and addressed by evaluating the design, deployment, and use of recommender systems, and the trade-offs between the different interests at stake. A failure to do so may lead to opportunity costs as well as problems that could otherwise be mitigated or avoided altogether, and, in turn, to public distrust of and backlash against the use of RS in general (Koene et al. 2015).

Research into the ethical issues posed by RS is still in its infancy. The debate is also fragmented across different scientific communities, as it tends to focus on specific aspects and applications of these systems in a variety of contexts. The current fragmentation of the debate may be due to two main factors: the relative newness of the technology, which took off with the spread of internet-based services and the introduction of collaborative filtering techniques in the 1990s (Adomavicius and Tuzhilin 2005; Pennock et al. 2000); and the proprietary and privacy issues involved in the design and deployment of this class of algorithms. The details of RS currently in operation are treated as highly guarded industrial secrets. This makes it difficult for independent researchers to access information about their internal operations, and hence provide any evidence-based assessment. In the same vein, due to privacy concerns, providers of recommendation systems may be reluctant to share information that could compromise their users’ personal data (Paraschakis 2018).

Against this background, this article addresses both problems (infancy and fragmentation), by providing a survey of the current state of the literature, and by proposing an overarching framework to situate the contributions to the debate. The overall goal is to reconstruct the whole debate, understand its main issues, and hence offer a starting point for better ways of designing RS and regulating their use.

2 A working definition of recommender systems

The task of a recommendation system—i.e. what we shall call the recommendation problem—is often summarised as that of finding good items (Jannach and Adomavicius 2016). This description is popular among practitioners, especially in the context of e-commerce applications. However, it is too broad and not very helpful for research purposes. To make it operational, one needs to specify, among other things, three parameters:

(a) what the space of options is;

(b) what counts as a good recommendation; and, importantly,

(c) how the RS's performance can be evaluated.

How these parameters are specified depends heavily on the domain of application and on the level of abstraction [LoAs; see Floridi (2016) and Footnote 1] from which the problem is considered (Jannach et al. 2012). The literature typically adopts three LoAs: catalogue-based, decision support, and multi-stakeholder environment. Let us consider each of these in turn.

In e-commerce applications, the space of options (that is, the observables selected by the LoA) may be the items in the catalogue, while a good recommendation may be specified as one which ultimately results in a purchase. To evaluate the system's performance, one may compare the RS's predictions to the actual user behaviour after a recommendation is made. In the domain of news recommendations, a good recommendation may be defined as a news item that is relevant to the user (Floridi 2008), and one may use click-through rates as a proxy to evaluate the accuracy of the system's recommendations. Such RS are designed to build a model of individual users and to use it to predict the users' feedback on the system's recommendations, which is essentially a prediction problem.
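To make the prediction framing concrete, the following is a minimal illustrative sketch, not drawn from any of the systems or papers cited here, of user-based collaborative filtering in Python: unknown ratings are predicted as a similarity-weighted average of the ratings given by the most similar users, and accuracy could then be evaluated by comparing such predictions against held-out user behaviour. The rating matrix and all parameter values are hypothetical.

```python
import numpy as np

# Hypothetical user-item rating matrix (rows: users, columns: items); 0 = unrated.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

def cosine_similarity(a, b):
    """Cosine similarity computed over co-rated items only."""
    mask = (a > 0) & (b > 0)
    if not mask.any():
        return 0.0
    a, b = a[mask], b[mask]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def predict_rating(R, user, item, k=2):
    """Predict a missing rating as a similarity-weighted average of the
    ratings given to `item` by the k most similar users who rated it."""
    sims = np.array([cosine_similarity(R[user], R[v]) if v != user else -1.0
                     for v in range(R.shape[0])])
    raters = [v for v in np.argsort(sims)[::-1] if R[v, item] > 0][:k]
    if not raters:
        return np.nan
    weights = sims[raters]
    if weights.sum() <= 0:
        return float(R[raters, item].mean())
    return float(weights @ R[raters, item] / weights.sum())

# Predict how user 1 would rate item 2, which they have not yet seen.
print(predict_rating(R, user=1, item=2))
```

Content-based approaches work analogously, but compute similarity between item features and a profile of the user's past preferences rather than between users.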

Taking a different LoA, RS may also be considered to provide decision support to their users. For example, an online booking RS may be designed to facilitate the user's choice among hotel options. In this case, defining what counts as a good recommendation is more complex, because it involves an appreciation of the user's goals and decision-making abilities. Evaluating the system's performance as a decision support requires more elaborate metrics. For example, Jameson et al. (2015) consider six strategies for generating recommendations, which track different choice patterns based on one of the following features: (1) the attributes of the options; (2) the expected consequences of choosing an option; (3) prior experience with similar options; (4) social pressure or social information about the options; (5) following a specific policy; (6) trial-and-error-based choice.

More recently, Abdollahpouri et al. (2017) have proposed a different kind of LoA (our terminology), defining RS in terms of multi-stakeholder environments (what we would call the LoA’s observables), where multiple parties (including users, providers, and system administrators) can derive different utilities from recommendations. Epistemologically, this approach is helpful because it enables one to conceptualise explicitly the impact that RS have at different levels, both on the individual users interacting with them, and on society more broadly, making it possible to articulate what ethical trade-offs could be made between these different, possibly competing interests. Figure 1 presents a diagram of the stakeholders in a RS.

Fig. 1 Stakeholders in a RS

In view of the previous LoAs, and for the purposes of this article, we take recommender systems to be a class of algorithms that address the recommendation problem using a content-based or collaborative filtering approach, or a combination thereof. The technical landscape of RS is, of course, much broader. However, our choice is primarily motivated by our finding that the existing literature on the ethical aspects of RS focuses on these types of systems. Furthermore, the choice offers three advantages. It is compatible with the most common LoAs listed above. By focusing on the algorithmic nature of recommender systems, it also singles out one of the fastest growing areas of research and applications for machine learning. And it enables us to narrow down the scope of the study, as we shall not consider systems that approach the recommendation problem using different techniques, such as, for instance, knowledge-based systems. With these advantages in mind, in the next section, we propose a general taxonomy to identify the ethical challenges of RS. In Sect. 4, we review the current literature, structured around six areas of concern. We conclude in Sect. 5 by mapping the discussion onto our ethical taxonomy and indicating the direction of our further work in the area.

3 How to map the ethical challenges posed by recommender systems

To identify what is ethically at stake in the design and deployment of RS, let us start with a formal taxonomy. This is how we propose to design it.

The question of which moral principles may be correct is deeply contentious and much debated in philosophy. Fortunately, in this article, we do not have to take a side, because all we need is a distinction about which there is general consensus: there are at least two classes of morally relevant variables, namely actions and consequences. Of course, other things could also be morally relevant, in particular intentions. However, for our purposes, the aforementioned distinction is all we need, so we shall assume that a recommender system's behaviour and impact will suffice to provide a clear understanding of what is ethically at stake.

The value of some consequences is often measured in terms of the utility they produce. So, it is reasonable to assume that any aspect of a RS that could negatively impact the utility of any of its stakeholders, or risk imposing such negative impacts, constitutes a feature that is ethically relevant.

While the concept of utility can be made operational using quantifiable metrics, rights are usually taken to provide qualitative constraints on actions. Thinking in terms of actions and consequences, we can identify two ways in which a recommender system can have ethical impacts. First, its operations can

(a) impact (negatively) the utility of any of its stakeholders; and/or

(b) violate their rights.

Second, these two kinds of ethical impact may be immediate—for example, a recommendation may be inaccurate, leading to a decrease in utility for the user—or they may expose the relevant parties to future risks. The ethics of risk imposition is the subject of a growing philosophical literature, which highlights how most activities involve the imposition of risks (Hansson 2010; Hayenhjelm and Wolff 2012). In the case of RS, for example, the risks may involve exposing users to undue privacy violations by external actors, or to potentially irrelevant or damaging content. Exposure to risks of these sorts can constitute a wrong, even if no adverse consequences actually materialise (see Footnote 2).

Given the previous analysis, we may now categorise the ethical issues caused by recommender systems along two dimensions (see Table 1):

Table 1 Taxonomy of ethical issues of recommender systems

1. whether a (given feature of a) RS negatively impacts the utility of some of its stakeholders or, instead, constitutes a rights violation, which is not necessarily measured in terms of utility; and

2. whether the negative impact constitutes an immediate harm, or exposes the relevant party to a future risk of harm or rights violation.

Table 1 summarises our proposed taxonomy, including some examples of different types of ethical impacts of recommender systems, to be discussed in Sect. 5.

With the help of this taxonomy we are now ready to review the contributions provided by the current literature. We shall offer a general discussion of our findings in the conclusion.

4 The ethical challenges of recommender systems

The literature addressing the ethical challenges posed by RS is sparse and fragmented across disciplinary divides, with the discussion of specific issues often linked to a specific instance of a RS. Through a multidisciplinary, comparative meta-analysis, we identified six main areas of ethical concern (see "Appendix" for our methodology). They often overlap but, for the sake of clarity, we shall analyse them separately in the rest of this section.

4.1 Inappropriate content

Only a handful of studies to date explicitly address the ethics of RS as a specific issue in itself. Earlier work on the question of ethical recommendations focuses more on the content of the recommendations, and proposes ways to filter the items recommended by the system on the basis of cultural and ethical preferences. Four studies are particularly relevant. Souali et al. (2011) consider the issue of RS that are not culturally appropriate, and propose an "ethical database", constructed on the basis of what are taken to be a region's generally accepted cultural norms, which acts as a filter for the recommendations. Tang and Winoto (2016) take a more dynamic approach to the issue, proposing a two-layer RS, comprising a user-adjustable "ethical filter" that screens the items that can be recommended based on the user's specified ethical preferences. Rodriguez and Watkins (2009) adopt a more abstract approach to the problem of ethical recommendations, proposing a vision for a eudaimonic RS, whose purpose is to "produce societies in which the individuals experience satisfaction through a deep engagement in the world". This, the authors predict, could be achieved through the use of interlinked big data structures.
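As a purely illustrative sketch of how a user-adjustable filter layer of the kind just described might sit on top of an underlying recommender (none of the cited papers specifies this implementation; the item labels and exclusion list below are hypothetical), candidate recommendations can simply be screened against the user's declared exclusions before being ranked and shown:

```python
# Hypothetical two-layer setup: an underlying recommender proposes candidates,
# and a user-configurable filter screens them before they are displayed.

def ethical_filter(candidates, excluded_labels):
    """Drop any candidate item carrying a label the user has chosen to exclude."""
    return [item for item in candidates
            if not (set(item["labels"]) & set(excluded_labels))]

candidates = [
    {"title": "Movie A", "labels": ["violence"]},
    {"title": "Movie B", "labels": ["family"]},
    {"title": "Movie C", "labels": ["gambling", "comedy"]},
]

user_exclusions = ["violence", "gambling"]   # set by the user, not the provider
print([item["title"] for item in ethical_filter(candidates, user_exclusions)])
# ['Movie B']
```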

Finally, Paraschakis (2016, 2017, 2018) provides one of the most detailed accounts. Focusing on e-commerce applications, Paraschakis suggests that there are five ethically problematic areas:

  • the practices of user profiling,

  • data publishing,

  • algorithm design,

  • user interface design, and

  • online experimentation or A/B testing, i.e. the practice of exposing selected groups of users to modifications of the algorithm, with the aim of gathering feedback on the effectiveness of each version from the user responses.

The risks he identifies relate to breaches of a user’s privacy (e.g. via data leaks, or by data gathering in the absence of explicit consent), anonymity breaches, behaviour manipulation and bias in the recommendations given to the user, content censorship, exposure to side effects, and unequal treatment in A/B testing with a lack of user awareness, leading to a lack of trust. The solutions put forward in Paraschakis (2017) revolve around a user-centred design approach (more on this in the next paragraph), introducing adjustable tools for users to control explicitly the way in which RS use their personal data, to filter out marketing biases or content censorship, and to opt out of online experiments.

With the exception of Souali et al. (2011), who adopt a recommendation filter based on geographically located cultural norms, the solutions described in this section rely on a user-centred approach. Recalling our taxonomy, they try to minimise the negative impact on the user's utility (in particular, unwanted exposure to testing and inaccurate recommendations) and on the user's rights (in particular, recommendations that clash with the user's values or expose them to privacy violations). However, user-centred solutions have significant shortcomings: they may not transfer to other domains; they may be insufficient to protect the user's privacy; and they may reduce efficiency, for example by impairing the system's ability to generate new recommendations if enough users opt out of profile tracking or online testing. Moreover, users' choice of parameters can itself reveal sensitive information about them. For example, adding a filter to exclude some kind of content gives away the information that the user may find that content distressing, irrelevant, or otherwise unacceptable. Above all, although user-centred solutions may foster the transparency of recommender systems, they also shift the responsibility and accountability for protecting rights and utility onto the users. Such solutions are demanding: users may be only nominally empowered, yet unable to manage all the procedures needed to protect their interests. The shift may therefore be unfair, since it places undue burdens on the users; and it is in any case problematic, because the effectiveness of these solutions varies with the users' awareness and expertise, so that different users may end up with different levels of protection depending on their ability to control the technology (see Footnote 3).

Implementing an "ethical filter" for a recommender system, as suggested by Tang and Winoto (2016), would also be controversial in some applications, for example, if it were used by a government to limit citizens' ability to access politically sensitive content. As for the eudaimonic approach of Rodriguez and Watkins (2009), this goes in the direction of designing a recommender system that is an optimal decision support, yet it seems practically infeasible, or at least in need of much more research: figuring out what a "good human life" is remains a question that millennia of reflection have not yet settled.

4.2 Privacy

User privacy is one of the primary challenges for recommendation systems (Friedman et al. 2015; Koene et al. 2015; Paraschakis 2018). This may be seen as inevitable, given that a majority of the most commercially successful recommender systems are based on hybrid or collaborative filtering techniques, and work by constructing models of their users to generate personalised recommendations. Privacy risks occur in at least four stages. First, they can arise when data are collected or shared without the user's explicit consent. Second, once data sets are stored, there is the further risk that they may be leaked to external agents, or become subject to de-anonymization attempts (Narayanan 2008). At both stages, privacy breaches expose users to risks, which may result in loss of utility (for example, if individual users are targeted by malicious agents as a result), or in rights violations (for example, if users' private information is used in ways that threaten their individual autonomy, see Sect. 4.3). Third, and independently of how securely data are collected and stored, privacy concerns also arise at the stage of the inferences that the system can (enable one to) draw from the data. Users may not be aware of the nature of these inferences, and might object to this use of their personal data if they were better informed. Privacy risks do not only concern data collection because, for example, an external agent observing the recommendations that the system generates for a given user may be able to infer some sensitive information about the user (Friedman et al. 2015). Extending the notion of informed consent to such indirect inferences from user recommendations appears difficult (see Footnote 4). Finally, there is a subtler but important systemic issue regarding privacy, which arises at the stage of collaborative filtering: the system can construct a model of a user based on the data it has gathered from other users' interactions. In other words, as long as enough users interact and share their data with the system, the system may be able to construct a fairly accurate profile even of those users about whom it has fewer data. This indicates that it may not be feasible for individual users to be shielded completely from the kinds of inferences that the system may be able to draw about them. This could be a positive feature in some domains, like medical research, but it may also turn out to be problematic in other domains, like recruitment or finance.
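As a hedged illustration of this inference risk, and not an account of any system discussed in the cited works, the sketch below fits a simple classifier on the interaction histories of users who disclosed a hypothetical sensitive attribute, and then applies it to a user who shared only a few interactions and never disclosed the attribute; the synthetic data, the attribute, and the use of scikit-learn are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical binary interaction matrix: did user u click item i?
# These 200 users also disclosed a sensitive attribute (e.g. a health
# condition) that happens to correlate with clicks on a few items.
n_users, n_items = 200, 50
sensitive = rng.integers(0, 2, size=n_users)
base_clicks = rng.random((n_users, n_items)) < 0.15
correlated = np.zeros((n_users, n_items), dtype=bool)
correlated[:, :5] = (rng.random((n_users, 5)) < 0.6) & sensitive[:, None].astype(bool)
X = (base_clicks | correlated).astype(float)

# An observer with access to the data (the platform, or an external party)
# can fit a model linking interaction patterns to the disclosed attribute...
clf = LogisticRegression(max_iter=1000).fit(X, sensitive)

# ...and then score a new user who shared only a handful of interactions
# and never disclosed the attribute at all.
new_user = np.zeros(n_items)
new_user[[0, 2, 4]] = 1.0                    # three clicks on the correlated items
print(clf.predict_proba([new_user])[0, 1])   # inferred probability of the attribute
```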

Current solutions to the privacy challenges intrinsic to recommender systems (especially those based on collaborative filtering techniques) fall into three broad categories: architectural, algorithmic, and policy approaches (Friedman et al. 2015). Privacy-enhancing architectures aim to mitigate privacy risks by storing user data in separate and decentralised databases, to minimise the risk of leaks. Algorithmic solutions focus on using encryption to minimise the risk that user data could be exploited by external agents for unwarranted purposes. Policy approaches, including the GDPR, introduce explicit guidelines and sanctions to regulate data collection, use, and storage.

The user-centred recommendation framework proposed by Paraschakis (2017), which we already encountered in the previous section, also introduces explicit privacy controls, letting the users decide whether their data can be shared, and with whom. However, as we have already remarked, user-centred approaches have limits, as they may constitute a mere shift in responsibility, placing an undue burden on the users. An issue that may arise specifically with user-enabled privacy controls is that the user's privacy preferences would, in themselves, constitute informative metadata, which the system (or external observers) could use to make sensitive inferences about the user, for example, that a user with strong privacy settings may have certain psychological traits, or may have "something to hide". As for systemic inferences, given the nature of collaborative filtering methods, even if user-centred adjustments could be implemented effectively across the board, they would arguably still not solve the problem.

Crucially, due to the nature of recommender systems—which, as we have seen, rely on user models to generate personalised recommendations—any approach to the issue of user privacy will need to take into account not only the likely trade-off between privacy and accuracy, but also the fairness and explainability of the algorithms (Friedman et al. 2015; Koene et al. 2015). For this reason, ethical analyses of recommender systems are better developed by embracing a macro-ethical approach, that is, an approach that considers not only the specific ethical problems related to data, algorithms, and practices, but also how these problems relate to, depend on, and impact each other (Floridi and Taddeo 2016).

4.3 Autonomy and personal identity

Recommender systems can encroach on individual users' autonomy by providing recommendations that nudge users in a particular direction, by attempting to "addict" them to certain types of content, or by limiting the range of options to which they are exposed (Burr et al. 2018; de Vries 2010; Koene et al. 2015; Taddeo and Floridi 2018). These interventions can range from the benign (enabling individual agency and supporting better decision-making by filtering out irrelevant options), to the questionable (persuasion, nudging), to the possibly malign [manipulation and coercion (Burr et al. 2018)].

Algorithmic classification used to construct user models on the basis of aggregate user data can reproduce social categories. This may introduce bias in the recommendations. We shall discuss this risk in detail in Sect. 4.4. Here, the focus is on a distinctive set of issues arising when the algorithmic categorization of users does not follow recognisable social categories. de Vries (2010) powerfully articulates the idea that our experience of personal identity is mediated by the categories to which we are assigned. Algorithmic profiling, performed by recommender systems, can disrupt this individual experience of personal identity, for at least two main reasons. First, the recommender system’s model of each user is continuously reconfigured on the basis of the feedback provided by other users’ interactions with the system. In this sense, the system should not be conceptualised as tracking a pre-established user identity and tailoring its recommendations to it, but rather as contributing to the construction of the user identity dynamically (Floridi 2011). Second, the labelling that the system uses to categorise users may not correspond to recognisable attributes or social categories with which the user would self-identify (for example, because machine-generated categories may not correspond to any known social representation), so even if users could access the content of the model, they would not be able to interpret it and connect it with their lived experiences in a meaningful way. For example, the category ‘dog owner’ may be recognisable as significant to a user, while ‘bought a novelty sweater’ would be less socially significant; yet the RS may still regard it as statistically significant when making inferences about the preferences of the user. These features of recommender systems create an environment where personalization comes at the cost of removing the user from the social categories that help mediate their experiences of identity.

In this context, an interesting take on the issue of personal autonomy in relation to recommender systems comes from their "captology". Seaver (2018a) develops this concept from an anthropological perspective:

[a]s recommender[s] spread across online cultural infrastructures and become practically inescapable, thinking with traps offers an alternative to common ethical framings that oppose tropes of freedom and coercion (Seaver, 2018a).

Recommender systems appear to function as "sticky traps" (our terminology) insofar as they try to "glue" their users to specific solutions. This is reflected in what Seaver calls "captivation metrics" (i.e. metrics that measure user retention), which are commonly used by popular recommender systems. A prominent example is YouTube's recommendation algorithm, which has recently received much attention for its tendency to promote biased content and "fake news" in a bid to keep users engaged with the platform (Chaslot 2018). Regarding recommender systems as traps requires engaging with the minds of the users: traps can only be effective if their creators understand and work with the target's world view and motivations, so the autonomous agency of the target is not negated, but effectively exploited. Given this captological approach, and given the effectiveness and ubiquity of the traps of recommender systems, the question to ask is not how users can escape from them, but rather how users can make the traps work for them.

4.4 Opacity

In theory, explaining how personalised recommendations are generated for individual users could help to mitigate the risk of encroaching on their autonomy, by giving them access to the reasons why the system "thinks" that some options are relevant to them. It would also help to increase the transparency of the algorithmic decisions concerning how to classify and model users, thus helping to guard against bias.

Designing and evaluating explanations for recommender systems can take different forms, depending on the specific applications. As reported by Tintarev and Masthoff (2011), several studies have pursued a user-centred approach to evaluation metrics, including metrics to evaluate explanations of recommendations. What counts as a good explanation depends on several criteria: the purpose of the recommendation for the user; whether the explanation accurately matches the mechanism by which the recommendation is generated; whether it improves the system’s transparency and scrutability; and whether it helps the user to make decisions more efficiently (e.g. more quickly), and more effectively, e.g. in terms of increased satisfaction.

These criteria are satisfied by factual explanations (see Footnote 5). However, factual explanations are notoriously difficult to achieve. As noted by Herlocker et al. (2000), recommendations generated by collaborative filtering techniques can, on a simple level, be conceptualised as analogous to "word of mouth" recommendations among users. However, offline word-of-mouth recommendations can work on the basis of trust and shared personal experience, whereas in the case of recommender systems users have access neither to the identity of the other users, nor to the models that the system uses to generate the recommendations. As we mentioned, this is an issue insofar as it diminishes the user's autonomy. It may also be difficult to provide good factual explanations in practice for computational reasons (the computation required to generate a good explanation may be too complex), and because they may have distorting effects on the accuracy of the recommendations (Tintarev and Masthoff 2011). For example, explaining to a user that a certain item is recommended because it is the most popular with other users may increase the item's desirability, thus generating a self-reinforcing pattern whereby the item is recommended more often because it is popular. This, in turn, reinforces its popularity, ending in a winner-takes-all scenario that, depending on the intended domain of application, can have negative effects on the variety of options, the plurality of choices, and the emergence of competition (Germano et al. 2019). Arguably, this may be one of the reasons why Amazon does not automatically privilege products that have less than perfect scores but have been rated by a large number of reviewers.
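The self-reinforcing dynamic described above can be made vivid with a minimal, purely hypothetical simulation (it is not taken from any of the cited works): if the system always recommends the currently most popular item and users usually accept the recommendation, a tiny initial head start snowballs into a dominant share of all interactions.

```python
import numpy as np

rng = np.random.default_rng(42)

n_items, n_rounds = 10, 5000
popularity = np.ones(n_items)          # items start with (almost) equal popularity
popularity[0] += 1                     # one item gets a tiny head start
accept_prob = 0.7                      # users usually follow the recommendation

for _ in range(n_rounds):
    item = int(np.argmax(popularity))  # always recommend the currently most popular item
    if rng.random() < accept_prob:
        popularity[item] += 1          # acceptance feeds back into popularity
    else:
        popularity[rng.integers(n_items)] += 1  # otherwise the user browses at random

shares = popularity / popularity.sum()
print(np.round(shares, 3))  # the head-start item ends up with a dominant share
```

Lowering the acceptance probability, or injecting randomness into what is recommended, slows the concentration in this toy model but does not remove the underlying feedback loop.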

4.5 Fairness

Fairness in algorithmic decision-making is a wide-ranging issue, made more complicated by the existence of multiple notions of fairness, which are not all mutually compatible (Friedler et al. 2016). In the context of recommender systems, several articles identified in this review address the issue of recommendations that may reproduce social biases. Their contributions can be synthesised into two approaches.

On the one hand, Yao and Huang (2017) consider several possible sources of unfairness in collaborative filtering, and introduce four new metrics to address them by measuring the distance between the recommendations made by the system to different groups of users. Focusing on collaborative filtering techniques, they note that these methods assume that the missing ratings (i.e., the ones that the system needs to infer from the statistical data to predict a user's preferences) are randomly distributed. However, this assumption of randomness introduces a potential source of bias into the system's predictions, because it is well documented that users' underlying preferences often differ from the sampled ratings, since the latter are affected by social factors, which may be biased (Marlin et al. 2007). Following Yao and Huang (2017), Farnadi et al. (2018) also trace the two primary sources of bias in recommender systems to two problematic patterns of data collection: observation bias, which results from feedback loops generated by the system's recommendations to specific groups of users, and population imbalance, where the data available to the system reflect existing social patterns expressing bias towards some groups. They propose a probabilistic programming approach to mitigate the system's bias against protected social groups.
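To illustrate the general idea of such group-based measures, the sketch below computes a simple disparity score: the gap between the mean absolute prediction error for users in a protected group and for everyone else. It is an assumption-laden simplification in the same spirit as, but not a reproduction of, Yao and Huang's four metrics; all data and labels are hypothetical.

```python
import numpy as np

def group_error_disparity(y_true, y_pred, group):
    """Illustrative group-disparity measure: the absolute gap between the
    mean absolute prediction error for users in a protected group (group == 1)
    and for everyone else. A large gap suggests the recommender serves one
    group systematically worse."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    err = np.abs(y_true - y_pred)
    return abs(err[group == 1].mean() - err[group == 0].mean())

# Hypothetical held-out ratings, model predictions, and group membership labels.
y_true = [5, 4, 3, 4, 2, 5, 1, 3]
y_pred = [4.8, 4.1, 2.9, 3.7, 3.5, 3.2, 2.4, 1.9]
group  = [0, 0, 0, 0, 1, 1, 1, 1]   # 1 = member of a protected group
print(group_error_disparity(y_true, y_pred, group))  # about 1.28 here
```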

On the other hand, Burke (2017) suggests considering fairness in recommendation systems as a multi-sided concept. Based on this approach, he focuses on three notions of fair recommendations, taking the perspective of either the user/consumer (C-fairness), whose interest is to receive the most relevant recommendations; the provider (P-fairness), whose interest is for their own products or services to be recommended to potentially interested users; or a combination of the two (CP-fairness). This taxonomy enables the developer of a recommendation system to identify how the competing interests of different parties are affected by the system's recommendations, and hence to design system architectures that can mediate effectively between these interests.

In both approaches, the issue of fairness is tied up with choosing the right LoA for a specific application of a recommender system. Given that the concept of fairness is strongly tied to the social context within which the system gathers its data and makes recommendations, extending a single approach to every application of recommender systems may not be viable.

4.6 Social effects

A much-discussed effect of some recommender systems is their transformative impact on society. In particular, news recommender systems and social media filters, by the nature of their design, run the risk of insulating users from exposure to different viewpoints, creating self-reinforcing biases and "filter bubbles" that are damaging to the normal functioning of public debate, group deliberation, and democratic institutions more generally (Bozdag 2013; Bozdag and van den Hoven 2015; Harambam et al. 2018; Helberger et al. 2016; Koene et al. 2015; Reviglio 2017; Zook et al. 2017). This feature of recommender systems can have negative effects on social utility. A relatively recent but worrying example is the spread of propaganda against vaccines, which has been linked to a decrease in herd immunity (Burki 2019).

A closely related issue is protecting these systems from manipulation by especially active (and sometimes small) groups of users, whose interactions with the system can generate intense positive feedback, driving up the system's rate of recommendations for specific items (Chakraborty et al. 2019). News recommendation systems, streaming platforms, and social networks can become an arena for targeted political propaganda, as demonstrated by the Cambridge Analytica scandal in 2018 and the documented external interference in US political elections in recent years (Howard et al. 2019).

The literature on the topic proposes a range of approaches to increase the diversity of recommendations. A point noted by several authors is that news recommendation systems, in particular, must strike a trade-off between expected relevance to the user and diversity when generating personalised recommendations based on pre-specified user preferences or behavioural data (Helberger et al. 2016; Reviglio 2017). In this respect, Bozdag and van den Hoven (2015) argue that the design of algorithmic tools to combat informational segregation should be more sensitive to the democratic norms that are implicitly built into these tools.
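One common way to operationalise such a trade-off, sketched below purely as an illustration rather than as a method prescribed by the cited works, is a greedy re-ranking that scores each candidate by a weighted combination of its predicted relevance and its dissimilarity to the items already selected; the weight lam controls how much diversity is bought at the expense of relevance. The items, scores, and similarity values are hypothetical.

```python
import numpy as np

def diversified_rerank(relevance, similarity, k, lam=0.7):
    """Greedy re-ranking trading off relevance against diversity.
    relevance: predicted relevance score per candidate item.
    similarity: pairwise item-similarity matrix (e.g. topic overlap).
    lam: 1.0 = pure relevance, 0.0 = pure diversity.
    Returns the indices of the k selected items, in order."""
    relevance = np.asarray(relevance, dtype=float)
    candidates = list(range(len(relevance)))
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            if not selected:
                return relevance[i]
            redundancy = max(similarity[i][j] for j in selected)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical news items: 0 and 1 are near-duplicates covering the same story.
relevance = [0.9, 0.88, 0.6, 0.5]
similarity = np.array([
    [1.0, 0.95, 0.1, 0.2],
    [0.95, 1.0, 0.1, 0.2],
    [0.1, 0.1, 1.0, 0.3],
    [0.2, 0.2, 0.3, 1.0],
])
print(diversified_rerank(relevance, similarity, k=3, lam=0.5))
# [0, 2, 3]: the near-duplicate of item 0 is dropped from the top three
```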

In general, the approaches to the issue of polarization and social manipulability appear to be split between bottom-up and top-down strategies, prioritising either the preferences of users (and their autonomy in deciding how to configure the personalised recommendations) or the social preference for a balanced public arena. Once again, some authors take a decidedly user-centred perspective. For example, Harambam et al. (2018) propose the use of different "recommendation personae", or "pre-configured and anthropomorphised types of recommendation algorithms" expressing different user preferences with respect to novelty, diversity, relevance, and other attributes of a recommendation algorithm. In the same vein, Reviglio (2017) stresses the importance of promoting serendipity, even at the cost of sacrificing aspects of the user experience such as the relevance of the recommendations.

5 Discussion

Based on the review of the literature presented in the previous section, we can now revisit the taxonomy that we proposed in Sect. 3, and place the concerns that we have identified within the conceptual space that it provides. Table 2 summarises our results.

Table 2 Summary of identified ethical issues of recommender systems

Starting with privacy, the main challenge linked with privacy violations is the possibility of unfair or otherwise malicious uses of personal data to target individual users. Thus, it emerges from our review that privacy concerns may be best conceptualised as exposure to risk. Moreover, the types of risk to which privacy violations expose users fall mainly under the category of rights violations, such as unfair targeting and the use of manipulative techniques.

Issues of personal autonomy and identity also fall under the category of rights violations, and constitute cases of immediate violations. Unfair recommendations can be associated with a negative impact on utility but, as also noted by Yao and Huang (2017), fairness and utility are mutually independent, and unfairness may be best classified as a type of immediate rights violation. Table 3 summarises the findings of this paper in terms of the current issues and the possible solutions that we have identified in the literature.

Table 3 Ethical issues of RS and possible solutions

A notable insight emerging from the review is that most of the ethical impacts of recommender systems identified in the literature are analysed from the perspective of the receivers of the recommendations. This is evident not only in the reliance on accuracy metrics measuring the distance between user preferences and recommendations, but also in the fact that privacy, unfairness, opacity, and the appropriateness of content are judged from the perspective of the individual receiving the recommendations. However, individual users are not the only stakeholders of recommender systems (Burke 2017). The utility, rights, and risks carried by providers of recommender systems, and by society at large, should also be addressed explicitly in the design and operation of recommender systems. And there are also more complex, nested cases in which recommendations concern third parties (e.g., what to buy for a friend's birthday). Currently, this is (partially) evident only in the discussion of social polarization and its effects on democratic institutions (reviewed in Sect. 4.6). Failure to address explicitly these additional perspectives on the ethical impact of recommender systems may mask seriously problematic practices. A case in point is the introduction of a "bias" in favour of recommending unpopular items to maximise catalogue coverage in e-commerce applications (Jameson et al. 2015). This practice meets a specific need of the provider of a recommendation system, helping to minimise the number of unsold items, which in this specific instance may be considered a legitimate interest to be traded off against the utility that a user may receive from a more accurate recommendation. However, modelling the provider's interests as a bias added to the system is unhelpful if the aim is to identify the right level of trade-off between the provider's and the users' interests.

Any recommendation is a nudge, and any nudge embeds values. Opacity about which and whose values are at stake in recommender systems hinders the possibility of designing better systems that can also promote socially preferable outcomes and improve the balance between individual and non-individual utilities.

The distribution of the topics by discipline also reveals some interesting insights (summarised in Table 4). Among the reviewed articles, the ones addressing privacy, fairness, and opacity come predominantly from computer science. This is in line with general trends in the field of algorithmic approaches to decision-making, and with the presence of established metrics and technical approaches to address these challenges.

Table 4 Number of reviewed papers addressing each of the six concerns by discipline

In contrast, the challenges posed by socially transformative effects, manipulability, and personal autonomy are more difficult to address using purely technical approaches, largely because their definitions are qualitative, more contentious, and require viewing recommender systems in the light of the social context in which they operate. Thus, the articles identified in this review that relate to these issues are much more likely to come from philosophy, anthropology, and science and technology studies. The methodologies that they adopt are more varied, ranging from ethnographic study (Seaver 2018b), to hermeneutics (de Vries 2010), decision theory (Burr et al. 2018), and economics (Abdollahpouri et al. 2017).

6 Conclusion

This article offers a map and an analysis of the main ethical challenges posed by recommender systems, as identified in the current literature. It also highlights a gap in the relevant literature, insofar as it stresses the need to consider, when assessing the ethical impact of recommender systems, not only the interests of the receivers of the recommendations, but also those of the providers of recommender systems and of society at large (including third-party, nested cases of recommendations). The next steps are, therefore, to fill this gap and to articulate a comprehensive framework for addressing the ethical challenges posed by recommender systems, based on the taxonomy and the findings of this review.