1 Introduction

It was a few years ago – in March 2020 – when the gaming platform Steam (© 2022 Valve Corporation) promoted its new Steam recommendation service (Steam 2020). It featured interactivity and was advertised as an absolute novelty. Users receive recommendations based on the games they have already played – a pool of potentially interesting games is generated and narrowed down on the basis of other players with interests similar to the user’s. In addition, further recommendations can be made based on the games that friends play or on hints from curators. What exactly is new about this recommendation system only becomes clear with a look behind the screen, at the particular filtering methods of the recommendation system and the algorithms used.

The idea of recommender systems in general is not new: The first recommender system was developed in 1992, but had little practical application due to insufficient computer processing power and limited data sources (Gahier and Gujral 2021). With the availability of higher-quality technologies that can process large amounts of data and the digitization of society, recommender systems have now spread to many areas of daily life: They have established themselves as robo-advisors in securities trading, accompany selection processes in human resources management and manage investments in media products (Linardatos 2020; Maume 2021; Isaias et al. 2010; Barreau 2020; Fleder et al. 2010). Especially in the latter application area, the system is supposed to predict how strong a user’s interest in a (virtual) product is, in order to recommend to the user exactly those products from the set of all available products that are likely to interest them the most (Mahesh and Vivek 2021). However, both the number of people using these offerings and the number of objects to be recommended have increased in recent years; moreover, the interests of the providers of objects now sit alongside the interests of users: Providers want to be seen (Goanta and Spanakis 2020). In the field of computer games, it is especially the offerings of smaller game developers (so-called indie game developers) that the recommendation algorithms bring to the screens of many users. The problem that arises from the multitude of products and information, and from the different filters and ambitions, is the prototypical “black box” (or sometimes “white box”) problem.

2 The Black Box Problem of AI Applications

An overarching goal of recommender systems is to make a prediction that quantifies how strong a user’s interest in an object is, in order to recommend to the user exactly those objects from the set of all available objects in which the user is most likely to be interested (Ziegler and Loepp 2019). The quality of a recommender system as perceived by the user depends not only on the predictive quality of the algorithms, but also to a large extent on the usability of the system (Knijnenburg et al. 2012). To determine the appropriate recommendations, the service uses machine learning and information retrieval methods (Zednik 2021; Mohanty et al. 2020; Silva 2019). Although users are usually interested in having a movie, product, or service recommended to them that is tailored to their interests (Schmidt et al. 2018) – so that they do not have to search on their own and sift through the almost infinite number of objects – they are usually not aware of why AI systems make decisions or behave in certain ways. In fact, most of these procedures lead to “black box” phenomena (Zafar et al. 2017): Knowledge – a model – is established through machine learning processes, but it is not explainable or comprehensible to the users of the systems, or only with great difficulty (Niederée and Nejdl 2020). Opacity can affect users’ trust in the system and lead to rejection of the systems, especially in contexts where the consequences are significant (Raj 2020; Burrell 2016; Ribeiro et al. 2016). For example, if AI is used in medicine and supports the attending physician in evaluating CT or MRI scans, the algorithm learns and can analyze the data faster than a human, flag tumors and suggest possible therapy outcomes. However, the positive effect of constantly and rapidly growing knowledge and accurate classification of symptoms in real time (especially in cancer detection) also raises issues of trust. It remains opaque – at least from the user’s perspective (which may include both the patient and the treating physician) – how the AI distinguishes harmless cysts from malignant cancer. Due to this opacity of the detection process and the risk of serious health consequences in case of misdiagnosis, patients tend not to trust the AI. Instead, they prefer to trust the professional opinion of a human physician and their assessment of the need for therapy. Nonetheless, opacity is always agent-dependent, meaning that a computer system is not opaque in and of itself, but in relation to the actor using it (Humphreys 2008). The developer of an algorithm can understand its operation better than a user. The degree of opacity also depends on what kind of algorithms are used to generate the output. A range of model classes can be used, such as linear models (e.g., logistic regression), generalized additive models (GAMs), decision trees, clustering methods (e.g., k-nearest neighbors), kernel-based methods (e.g., support vector machines), ensemble methods (e.g., random forests, XGBoost) and neural networks (e.g., CNNs, RNNs) (Niederée and Nejdl 2020).Footnote 1 The manifold possibilities of algorithms are also legally relevant and are put into a legislative context, for example in Annex I of the Artificial Intelligence Act (COM/2021/206 final).
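
To illustrate the range of model classes listed above, the following minimal sketch (in Python, assuming the scikit-learn library is available; the data are synthetic and purely illustrative) trains several of them on the same toy classification task. It is meant only to show that very different algorithm families can solve the same problem while differing greatly in how inspectable they are; it does not reflect any particular recommender implementation.

```python
# Hypothetical sketch: train several of the model families named above on the
# same synthetic classification task to illustrate the range of approaches.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

models = {
    "linear model (logistic regression)": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=3),
    "k-nearest neighbors": KNeighborsClassifier(),
    "kernel-based (SVM)": SVC(),
    "ensemble (random forest)": RandomForestClassifier(),
    "neural network (MLP)": MLPClassifier(max_iter=2000),
}

for name, model in models.items():
    model.fit(X, y)
    print(f"{name}: training accuracy {model.score(X, y):.2f}")
```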

2.1 Transparency and Explainability: An Introduction

In order to approach the “black hole” problem of gaming platforms, it is first necessary to explain the “black box” phenomenon. In this respect, a distinction already has to be made between the opacity of recommender systems on the one hand and the general opacity of AI systems on the other, each of which has a different dimension. For this purpose, transparency requirements and explanation intentions have to be set in relation to the current state of the technology. It is obvious that the demand for transparency toward all affected groups of people is not welcomed from every technical perspective. Nevertheless, the debate is important if the AI system is designed for interactivity – if users are allowed and expected to participate responsibly, they also need to have a basic understanding of the process.

2.2 Efficiency vs. Explainability of Machine Learning

The various applicable algorithms are of varying effectiveness and also have varying degrees of transparency. While linear models (such as rule-based systems) or decision trees can be explained comparatively well (as Fig. 4.1 shows), their accuracy potential is comparatively lower; neural networks are potentially more accurate. Of course, this depends on the application context – but in general this can be demonstrated in research (Körner 2020; Abdullah et al. 2021). There are several reasons for the success of neural networks: (1) Expressiveness: Similar algorithms can be used for an increasing number of domains and problems. For example, certain neural network architectures can be used for prediction, autonomous driving, pharmaceutical research, and particle physics alike. (2) Versatility: Different types of data can be used together, even in multimodal approaches where different types of data are processed simultaneously. (3) Adaptability: Some of the approaches can be transferred to new tasks with little effort.Footnote 2 (4) Efficiency: Special hardware has enabled corresponding models to be trained faster and more efficiently (Körner 2020).

Fig. 4.1
Different approaches to AI, as measured by their explanatory power and accuracy potential. The approaches fall along a diagonal from high accuracy potential and low explainability to low accuracy potential and high explainability: neural networks, ensemble methods, kernel-based methods, clustering, decision trees, generalized additive models, and linear models

However, these more advanced and precise methods of machine learning in general, and neural networks in particular, are less comprehensible than the simpler forms. The simpler approaches, such as rule-based systems or decision trees, can be explained in principle. For example, if a simple algorithm such as a decision tree is built into an autonomous vehicle, recognizes that the vehicle needs to stop at a red light and reports this accordingly, most people can understand that decision. Such methods are sometimes referred to as a “white box”, although they are of course not comprehensible in detail to everyone (for this reason Pasquale still speaks of a “black box” from the user perspective; Pasquale 2015).Footnote 3 Theoretically, the input and output data would usually be known to the user. In addition, the systems are comprehensible to the user to a certain extent due to their internal knowledge structure and the rules used for decision-making (see Fig. 4.2). Moreover, this applies regardless of the expertise of those on whom the system acts (Niederée and Nejdl 2020).

With respect to neural networks, by contrast, one speaks of the “black box” problem. They are opaque and hardly explainable to the user, since a multitude of different paths is conceivable for the algorithms during decision making (see Fig. 4.3). This can lead to considerable problems. Neural networks are being used in an ever-expanding range of domains, and the resulting decisions and assessments increasingly affect areas critical to the lives of the people involved, such as medicine. Better understanding and explaining the results of machine learning has several benefits (Holzinger 2018). For example, it is interesting to know on which data the AI system’s decision is based – how reliable are they and of what quality? It is also relevant how exactly the patient data (and which of it) was matched against the training data. This would make it possible to check and evaluate machine decision proposals and assessments for their credibility. While symbolic systems can be examined line by line, instruction by instruction, in neural networks the symbolic representation of the knowledge and the explicit control flow disappear. The knowledge and behavior stored in the neural network can now only be inferred indirectly through experimentation (Ebers 2020). There are several reasons for this: (1) The strength of these types of networks is their ability to learn. Given a training data set that contains the correct answers, they can gradually improve their performance by optimizing the strength of each connection until even their top-level outputs are correct. This process, which simulates how the brain learns by strengthening or weakening synapses, eventually leads to a network that can successfully classify new data that was not part of its training set. Thus, they are not limited to human perceptual and communication patterns. This type of learning is partly why they are so powerful, but also why the information in the network is so diffuse: Similar to the brain, memory is encoded in the strength of multiple connections rather than stored in specific locations as in a traditional database (Castelvecchi 2016; Robbins 2019). (2) Another property of deep neural networks is that they can learn for themselves the features they use; however, this extends the “black box” problem to those features and further complicates explainability (Niederée and Nejdl 2020).

Fig. 4.2
Example of a decision tree structure, with a root node, decision nodes, and leaf nodes. (Sarker 2021)

Fig. 4.3
Basic neural network layout, with input, processing layers, a feedback loop, output, and result. (Uzair and Jamil 2020)
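
To make the contrast between the “white box” and the “black box” concrete, the following minimal sketch (assuming scikit-learn; the data set is a standard toy example, not related to recommender systems) fits both a shallow decision tree, whose learned rules can be printed as readable if-then statements, and a small neural network, which only exposes numerical weight matrices from which no comparable rule text can be read off.

```python
# Hypothetical sketch: the same task solved by a "white box" decision tree,
# whose rules can be printed, and an MLP, which only exposes weight matrices.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree))          # human-readable if-then rules

mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=3000).fit(X, y)
# The "knowledge" of the network is spread across its weight matrices:
for layer, weights in enumerate(mlp.coefs_):
    print(f"layer {layer}: weight matrix of shape {weights.shape}")
```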

2.3 Background of the Transparency Requirement

Regulators are increasingly focusing on the objective of transparency of machine learning systems in general. With the draft of the European Artificial Intelligence Act published in April 2021 (COM/2021/206 final), the requirement of transparency and explainability was taken up again and readdressed in comprehensive regulations. Since there is often linguistic imprecision regarding the meaning of explainability on the one hand and transparency on the other, Lipton (2016) has devised a taxonomy of methods and approaches within the realm of explainability that facilitates a basic understanding of the terms used here. According to Lipton, explainability is the overarching term for the two concepts of transparency and interpretability. The concept of transparency focuses strongly on the technology and algorithms involved, while the concept of interpretability is less technological and tied more closely to specific contexts (Waltl 2019); here, the focus is on human perception. Thus, when we speak of explainability in this context, it includes both transparency of the technical components and interpretability by the individuals using the system (Lipton 2016). In terms of the individuals involved, i.e., the addressees of transparency obligations, transparency becomes relevant at different levels (Anand et al. 2018): (1) Software developers and vendors need to understand how the system concerned works in order to fix any bugs and improve the system (Hohman et al. 2018). (2) Individuals affected by an algorithmic decision want to know and understand why the system reached a particular judgment, as this is the only way to detect any errors (in decision-making, in the basis for the decision, or in the evaluation of the decision). (3) Transparency allows legislators, regulators, certifiers, experts, courts or other neutral parties to assess the underlying process and the technical products (Rieder and Simon 2017; Ebers 2020).

In addition, the technical consideration of the individual process steps is also important. “Procedure” in this context refers to automated decision-making processes. Here, the transparency of the processes becomes relevant on three levels: the process level, the model level and the classification level. The process level refers to the various steps that an AI system needs to go through for training. This process usually includes five steps that immediately follow each other: data acquisition, data preparation for the purpose of correcting incomplete or erroneous data, data transformation to bring the data into a unified format, training the AI model by optimizing mathematical functions that approximate the training data, and post-processing of the results (Ebers 2020). In terms of the requirement for transparency, it is important to know and understand each of these individual steps in order to understand the associated algorithmic decision-making process. The model level refers to the different types of machine learning methods used to make decisions (Ebers 2020). Analyzing this level is important because the different models have different levels of transparency. The classification level provides information about which attributes or features are used in the model and the weighting given to each attribute (Ebers 2020).
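
As an illustration of the process level, the following minimal sketch (in Python with scikit-learn; the data and feature layout are hypothetical) walks through the five steps named above – acquisition, preparation, transformation, training and post-processing – in the simplest possible form.

```python
# A minimal sketch of the five process-level steps named above
# (hypothetical data; scikit-learn is assumed to be available).
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# (1) Data acquisition: here simply a hypothetical in-memory matrix of
#     user/item interaction features and binary "interested?" labels.
X = np.array([[3.0, 1.0], [np.nan, 0.0], [5.0, 1.0], [1.0, 0.0]])
y = np.array([1, 0, 1, 0])

pipeline = Pipeline([
    # (2) Data preparation: correct incomplete data by imputing missing values.
    ("impute", SimpleImputer(strategy="mean")),
    # (3) Data transformation: bring features onto a unified scale.
    ("scale", StandardScaler()),
    # (4) Training: optimize a mathematical function that approximates the data.
    ("model", LogisticRegression()),
])
pipeline.fit(X, y)

# (5) Post-processing: e.g., turn raw scores into a ranked recommendation list.
scores = pipeline.predict_proba(X)[:, 1]
ranking = np.argsort(-scores)
print(ranking)
```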

If transparency is to be explored in relation to a particular system and in relation to its users, four questions should be asked: What is happening? Why is it happening? How does it work? And where does the process take place? (Zednik 2021; Tomsett et al. 2018; Marr 1982). Those who review, approve, and certify systems to ultimately determine functional safety need to know what is about to happen. They have to ask what a particular system does – they are interested in what the system as a whole does and how algorithmic decision making occurs (Zednik 2021). Those whose data are processed and learned from in a decision-making process, as well as those who are told that a decision is being made, want to know why it is happening. Those who use an algorithmic system for decision making (e.g., bankers who assess creditworthiness using an automated system) likewise need to know why a system does what it does. For them, the interpretation of the behavior in relation to the specific facts of the case is relevant (Zednik 2021). Those who review, approve, and certify systems should know both why something happens and what happens. They and those who design, improve, and maintain the systems have to ask, at the algorithmic level, how a system works; at the implementation level, they have to ask how the program tries to realize the algorithms (Marr 1982). Thus, when approaching the question of the right level of transparency and finding the right way to inform and educate, these different interests need to be considered in relation to the different points of comprehensibility. Although the opacity of computer systems programmed with machine learning has traditionally been seen as the “black box” problem, in this sense it is perhaps more appropriate to speak of many “black box” problems. Depending on the perspective and the nature of the interaction with the machine learning program, the program will be opaque for different reasons and will need to be made transparent in different ways (Zednik 2021; see for a detailed analysis: Burrell 2016).

2.4 Criticism

Nevertheless, there are critical voices regarding the designation of AI-driven systems as “black boxes” and the associated demand for explanations and transparency (see Bryson 2019; in a normative sense: Robbins 2019). The demand for transparency is dismissed with the argument that an explanation of “how” a decision is reached is of little help to a user, since such an explanation is difficult to understand anyway (Anderson 1972). On this view, it is necessary, but also sufficient, for those who program and use AI software to keep detailed records of how it works; they just need to ensure that appropriate care is taken (Bryson 2019).

However, that is not the point. Even if the algorithm has been trained thoroughly and, in this sense, complies with the normative requirements, the results it generates in adaptive neural networks such as recommender systems are not always predictable. In particular, when multidimensional training data forms the basis, new and different information – which tends to be added more or less randomly in practical use – can flow into the training data and eventually affect the result in a lasting way that the programmer cannot predict (Zech 2019). This danger applies all the more if, as in the present case, the users participate in determining the decision parameters.

This ties in with the legal requirements for product safety. For example, according to the New Legislative Framework applicable in Europe (consisting of Regulation (EC) No 765/2008, Decision No 768/2008/EC and Regulation (EU) 2019/1020), safety-relevant products may only be brought onto the market if they have been tested and assessed as sufficiently safe by the manufacturer; beyond this, there is always an assessment of the legitimacy of such systems (Zech 2019). However, such an assessment requires prior understanding of the system used. Generally, this includes explaining the machine as such, but if AI is implemented, i.e., a tool or algorithm that controls the outcome of the machine, it also includes explaining those algorithms. Statements made by the manufacturer without knowledge of the system cannot be tolerated in the case of safety-relevant systems. However, the requirement of product safety certainly does not end with the development process up to placing the product on the market. The development of a normatively traceable product may relieve the developer, but it does not indicate anything about the further consequences that may arise for third parties and the environment. Even if the developer of a system has operated in accordance with the regulations, this by no means releases them from further monitoring, e.g., of the product’s consequences. This may, for example, be an expression of the product monitoring obligation as provided for in tort law (Ensthaler et al. 2012). Furthermore, explainability is not only important for the person placing the system on the market, but also for market surveillance authorities. There, too, the processes need to be comprehensible if the authorities want to check, among other things, whether the functional safety of the systems is guaranteed; violations can be punished with fines. In this respect, explainability is essential not only at the level of the programmer, but also at the level of all other market participants, which is why the current EU legislative acts (COM/2021/206 final; Regulation (EU) 2022/2065) give high priority to the transparency of machine systems. Due diligence on the part of users of AI systems is an important part of safety, but explainability is an additional – and above all essential – part. Thus, the legislature has made it clear that self-regulation is not effective and not sufficient to protect users.

2.5 In Terms of Recommender Systems

Compared to the one-dimensional AI systems in the form of neural networks described so far, the associated “black box” phenomenon is amplified in recommender systems. The weightings made by the algorithms can additionally be affected by user signals – and conversely, the recommendation system also has the power to shape and control user preferences and habits (Leerssen 2020). For this reason, it is particularly important in this context to look beyond the pure algorithms and understand the complex interactions between technology and users. Several questions arise: (1) What algorithms are used to generate recommendations, i.e., what filtering method is used? (2) What recommendations are made? (3) What user content, metadata, or behavioral data feeds into the system? And (4) what human actors or organizational structures are involved in the process that outsiders never see?

Especially in the area of recommender systems, providers have an interest in being cautious about the algorithms and filters they use and in not making these methods transparently available to users. Platform operators provide various reasons: (1) The platforms argue that the design of recommender systems involves commercially valuable trade secrets and that they would suffer economic disadvantages by publishing the methods. (2) Keeping the algorithms secret may be necessary in some cases to prevent users from undermining the gatekeeping function (e.g., abuse by spamming if the keyword blacklist is published). (3) User privacy may be compromised to the extent that the algorithm is developed based on user data (Leerssen 2020). The last point in particular draws attention to the socio-technical perspective: The meaning of algorithms is highly context-dependent, as the results of the system are co-determined by user behavior. It is therefore argued that, in terms of transparency, the output should be explained first and the further recommendation patterns considered from there (Rieder et al. 2018). This includes, for example, the type of recommendations as well as user content, metadata, behavioral data, etc. In short, users should know which of their personal data is linked to which data in the algorithm and through which linkage of these two pillars the recommendation emerges. However, this poses a particular challenge because the output of the system is not generalizable. Thus, in the context of recommender systems, it is not only unclear why a decision is made, but also which decisions are made (Leerssen 2020). In recommender systems, not only the code or data needs to be made transparent; human and non-human actors also need to be included (Ananny and Crawford 2018).

3 The Black Hole Problem of Gaming Platforms

The reason we speak of a “black hole” instead of a “black box” in relation to gaming platforms is the multidimensionality of the gaming platform. Steam’s interactive recommender system, described at the beginning of this paper, is said to no longer rely on tags or reviews alone, but to learn from the games themselves (Steam 2020). It analyzes the games users play and compares them to the gaming habits of other users. Based on this, the selection is to be tailored even more precisely to the user – alongside the option for users to add parameters themselves by selecting whether they would also like to receive recommendations based on the preferences of friends and curators. Basically, gaming platforms behave no differently than other platforms (such as Amazon or YouTube, see Covington et al. 2016; Davidson et al. 2010) in the way they are perceived externally. The distinguishing feature, as will be discussed below, is the combination of different types of algorithms used in these cases and the disregard of some data in certain contexts. The platform offers three different components from which it draws and evaluates its information: a shopping component, a streaming component and a social media component. In the store, users can buy the respective games and then find them in their library, from where they can be downloaded to their device. In addition, it is possible to watch live streams for some games, which are reminiscent of video or streaming platforms such as YouTube (© Google Ireland Limited) or Twitch (© Twitch Interactive, Inc.). It is also possible to network with friends, which is necessary to play cooperative games together; however, users can also connect independently of any game and follow each other’s activities. The recommendation system processes all of the information it can obtain from these different platform components to generate the recommended output. Recommendation systems on shopping, streaming and social media platforms are not new territory – but linking them on a single platform poses special challenges. Here, we are dealing with learning, evolving and, above all, interacting neural networks.

3.1 Types of Recommender Systems

Recommender systems can basically be divided into two types if filtering methods are used as the distinguishing criterion: content-based and collaborative filtering systems. These are extended by a third category, that of hybrid methods.

3.1.1 Content-Based Filtering Methods

Content-based filter systems recommend to the user those offerings that are similar to the ones the user has preferred in the past (Adomavicius and Tuzhilin 2005). If these filtering methods are mapped onto the platform components mentioned above, they correspond to the area of video and streaming platforms. In most cases, the system here consists of two neural networks – one to generate a “pool” of possible recommended content and one to further assess and rank the individual content from this pool. This two-step approach allows recommendations to be made from a very large corpus of videos or streams, while ensuring that the small portion of output eventually displayed to the individual user is personalized and appealing.Footnote 4 The main task of ranking in the second step is to specialize and calibrate the candidate predictions (Covington et al. 2016). The main advantage of using deep neural networks for candidate generation is that new interest categories and newly appearing items can be added continuously. As accurate as deep neural networks may be, they are not only subject to fitting errors, but are also opaque and require explanation.Footnote 5
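
The two-step structure can be illustrated with a hypothetical sketch in Python; the random embeddings and the simple dot-product scoring stand in for the two trained networks and do not reflect any actual platform implementation.

```python
# Hypothetical sketch of a two-stage recommender: a candidate generator
# narrows a large corpus to a small pool, a ranker then orders that pool.
import numpy as np

rng = np.random.default_rng(0)
CORPUS_SIZE = 100_000          # e.g., all available videos/streams
user_embedding = rng.normal(size=32)
item_embeddings = rng.normal(size=(CORPUS_SIZE, 32))

def generate_candidates(user_vec, item_matrix, k=500):
    """Stage 1: retrieve the k items closest to the user in embedding space."""
    scores = item_matrix @ user_vec
    return np.argpartition(-scores, k)[:k]

def rank_candidates(user_vec, item_matrix, candidate_ids, n=10):
    """Stage 2: re-score the small pool with a (here trivial) ranking model."""
    scores = item_matrix[candidate_ids] @ user_vec   # stand-in for the ranking network
    order = np.argsort(-scores)
    return candidate_ids[order][:n]

pool = generate_candidates(user_embedding, item_embeddings)
top_10 = rank_candidates(user_embedding, item_embeddings, pool)
print(top_10)
```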

The filtering method is primarily based on the comparison of items and user information (Gahier and Gujral 2021). Based on this comparison, the method is further divided into user-centered and object-centered approaches. On the one hand, the recommendations should reflect the user’s behavior and activities; on the other hand, in the interest of visibility for providers, a set of content unknown to the user should also be displayed (Davidson et al. 2010). This content-based approach has some drawbacks: (1) Recommendations are limited by the features associated with the items. (2) Two videos with the same features are difficult to distinguish from each other (and videos are often divided into multiple parts). (3) Only similar items can be recommended (Adomavicius and Tuzhilin 2005). Moreover, the underlying data are highly noisy. The problem encompasses a variety of circumstances: Video metadata may be nonexistent, incomplete, outdated, or simply wrong. User data often capture only a fraction of a user’s activities and have limited ability to capture and measure engagement and satisfaction. The length of videos can affect the quality of the recommendations derived from them (Davidson et al. 2010). In the context of live streaming, stream providers are additionally not available indefinitely (Rappaz et al. 2021). These circumstances may result in the recommendation being inaccurate after all.
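
A minimal sketch of the underlying comparison between item features and a user profile might look as follows (the item features, genre tags and similarity measure are illustrative assumptions; real systems use far richer representations):

```python
# Hypothetical sketch of content-based filtering: recommend items whose
# feature vectors are most similar to a profile built from the user's history.
import numpy as np

# Item features, e.g., one column per genre tag (hypothetical).
items = {
    "game_a": np.array([1, 0, 1, 0], dtype=float),   # strategy + indie
    "game_b": np.array([1, 1, 0, 0], dtype=float),   # strategy + shooter
    "game_c": np.array([0, 0, 1, 1], dtype=float),   # indie + puzzle
}
liked_in_the_past = ["game_a"]

# User profile = average of the feature vectors of previously liked items.
profile = np.mean([items[i] for i in liked_in_the_past], axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Score every item not yet consumed by its similarity to the profile.
scores = {name: cosine(profile, vec)
          for name, vec in items.items() if name not in liked_in_the_past}
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```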

3.1.2 Collaborative Filtering Methods

Collaborative filtering methods recommend objects in which users with similar evaluation behavior have shown the greatest interest (Adomavicius and Tuzhilin 2005). No further knowledge about the object is required; the algorithms are either user-based or item-based. The former operate memory-based, i.e., they make a rating prediction for the user based on previous ratings. This prediction is calculated as a weighted average of the ratings given by other users, where the weight is proportional to the similarity between the users. The model-based or item-based algorithms, on the other hand, attempt to model users based on their past interests and use these models to predict ratings for unseen items. These algorithms typically span multiple interests of users by classifying them into multiple clusters or classes (Das et al. 2007).
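
The memory-based prediction described here – a similarity-weighted average of other users’ ratings – can be written as a short sketch (the ratings matrix is hypothetical, and cosine similarity is just one common choice of similarity measure):

```python
# Hypothetical sketch of user-based (memory-based) collaborative filtering:
# predict a rating as the similarity-weighted average of other users' ratings.
import numpy as np

# Rows = users, columns = items; 0 marks "not rated" (hypothetical data).
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

def cosine_sim(a, b):
    mask = (a > 0) & (b > 0)                  # compare only co-rated items
    if not mask.any():
        return 0.0
    return float(a[mask] @ b[mask] / (np.linalg.norm(a[mask]) * np.linalg.norm(b[mask])))

def predict(user, item):
    sims = np.array([cosine_sim(R[user], R[v]) if v != user else 0.0
                     for v in range(R.shape[0])])
    rated = R[:, item] > 0                    # only users who rated the item
    weights = sims * rated
    if weights.sum() == 0:
        return 0.0                            # cold start: no basis for a prediction
    return float(weights @ R[:, item] / weights.sum())

print(predict(user=0, item=2))                # predicted interest of user 0 in item 2
```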

This method is commonly used on shopping platforms. Unlike the content-based filtering method, it does not necessarily rely on deep neural networks, but also works on the basis of linear models or symbolic AI. It can be guided by the user’s implicit feedback, e.g., transaction or browsing history, or by explicit interactions, e.g., previous ratings. However, this also highlights the drawbacks of this method: (1) New users of a platform do not yet have a history on the basis of which their interests could be identified, i.e., recommendations are not personalized. (2) New items are difficult to recommend if they have not been previously rated (Adomavicius and Tuzhilin 2005). This is at odds with the intentions of lesser-known vendors, such as the indie game developers described at the beginning of this paper. If there are no ratings for the unknown games, they are less likely to be recommended – but this is exactly why a recommendation system is used on game platforms in the first place, namely to avoid focusing only on the big providers.

3.1.3 Hybrid Filtering Methods

Hybrid filtering methods first use the collaborative and content-based filtering methods just described separately and make predictions about user behavior based on each method. In a second step, the content-based features are included in the collaborative approach and the collaborative features are included in the content-based approach. In this way, the user of this method can create a general unified model that includes both content-based and collaborative features. These hybrid systems can also be augmented with knowledge-based techniques, such as case-based reasoning, to improve recommendation accuracy and solve some of the problems of traditional recommender systems (Adomavicius and Tuzhilin 2005).
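
In its simplest form, the combination can be a weighted blend of the two separate predictions, as the following hypothetical sketch shows; the score tables stand in for the outputs of the content-based and collaborative methods sketched above, and the blend weight is an assumption that could itself be tuned or learned.

```python
# Hypothetical sketch of a hybrid recommender: blend a content-based score
# and a collaborative-filtering score into one prediction per item.
# The score tables are made-up placeholders for the two methods above.
content_scores = {"game_a": 0.9, "game_b": 0.2, "game_c": 0.6}
collab_scores = {"game_a": 0.3, "game_b": 0.8, "game_c": 0.5}

def hybrid_score(item_id: str, alpha: float = 0.5) -> float:
    # alpha weights the two components; it could itself be tuned or learned.
    return alpha * content_scores[item_id] + (1 - alpha) * collab_scores[item_id]

ranked = sorted(content_scores, key=hybrid_score, reverse=True)
print(ranked)   # e.g., ['game_a', 'game_c', 'game_b'] for alpha = 0.5
```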

Such a hybrid form of recommendation can be found in the field of social media platforms. Recommendations in these online networks differ from those on the previously mentioned platforms in that not only content but also the social behavior of users is taken into account (Wang et al. 2013). Arguably, the most powerful influencing factors on social media platforms are the content recommendation systems that determine the ranking of the content presented to users. These have a powerful gatekeeping function that is the subject of widespread public debate (Cobbe and Singh 2019). The system’s recommendations appear on the start page, disguised among friends’ posts – but the order of the news feed is determined by the ranking algorithms (Leerssen 2020). Two fundamentally different policies are followed: interest-based and influence-based recommendations. Interest-based recommendation aims to evaluate the relevance between a user and a piece of content, so that the content most likely to interest the user is recommended (this corresponds to the focus of content-based recommendation). Influence-based recommendation examines what content should be shared to maximize influence (this draws on ideas of collaborative filtering) (Wang et al. 2013). Since the content recommender system mixes these two focuses as well as the underlying filtering methods, it is referred to as hybrid filtering. However, the problems and inaccuracies of each individual method described above exist here as well and may even be exacerbated.

3.2 Black Hole Phenomenon

Gaming platforms combine the different platform types of shopping, streaming and social media. This also combines the different types of filtering methods used by their recommender systems. The “black box” problems that arise with recommender systems in general are thus amplified – it is unclear what the input or the output is. It is also unclear which type of recommendation method currently has the upper hand in generating a game recommendation.

This is what we call a “black hole”: It is unclear in these evolving, interacting neural networks what inputs are being examined by the recommender system: Is it preferences for publishers or for genres? Is it reviews that users have submitted? Is it purchases and clicks? Is it what friends think is good or primarily what users think is good for them? What kind of output is being generated? What kind of recommendation system is being given priority – the collaborative one of the shopping component, the content-based one of the streaming interface, or the hybrid one of the social media component? Which algorithms are preferred?

Or even: What information collected at any point is lost and disappears into obscurity, and what information is used to generate recommendations? Much remains unclear, opaque and therefore unexplainable to the user. However, this lack of transparency affects not only the user, but also, to a lesser extent, the platform operator who sets up the recommendation system on their platform. Unlike the user, the operator can estimate which algorithms are used in a specific case, but the learning ability of the systems and the resulting, varying weighting of the available data remain a “black hole” for them as well. Nevertheless, the legislator’s primary focus in the legal acts described in more detail below, the Proposal for an Artificial Intelligence Act and the Digital Services Act (Regulation (EU) 2022/2065), is to protect the user. For this reason, and since users are even more affected by opacity, the following considerations focus on them in order to solve this problem.

4 Legal Bases and Consequences

“Where opacity is the problem, transparency is the solution” (Zednik 2021). In recent years, voices in the legal and policy literature have taken this approach and proposed ways to eliminate opacity (e.g., Leerssen 2020). More recently, the call for transparency with respect to AI systems has also found its way into legislation, as described above. Regarding the regulation of AI in particular, the European legislator has taken a major step in recent years by means of two legal acts that are worth a closer look: the Digital Services Act (Regulation (EU) 2022/2065) and the Proposal for an Artificial Intelligence Act (COM/2021/206 final). However, it is the requirements of these two acts that deserve a closer look in a second step in terms of appropriate implementation – especially with regard to the multidimensionally opaque gaming platforms.

As mentioned at the beginning, the generic term of explainability can be divided into transparency on the one hand and interpretability on the other, with transparency referring in particular to the technical components and the algorithms (Lipton 2016). In their legislative proposals, the European institutions also refer to a concept of transparency that in some respects precedes explainability (Berberich and Seip 2021). Since the legislator particularly focuses on the protection of the users of the platforms, we refer to user-oriented transparency in this context, which can admittedly represent only one aspect of the explainability of the entire system.

4.1 Legal Acts

Specifically relevant to recommender systems are two current legislative developments: the Digital Services Act (Regulation (EU) 2022/2065), which came into force on 16th November 2022 and will be fully applicable as of 17th February 2024, and the draft Artificial Intelligence Act (COM/2021/206 final). While the Digital Services Act broadly addresses transparency and impact of systems, in addition to definitions of terms, due diligence requirements, and enforcement mechanisms, the Artificial Intelligence Act focuses on the design and development of systems. Together, the two approaches can help address the black hole problem by developing solutions based on the requirements of the laws.

The legislation and its recitals make it clear that European legislators believe that recommender systems have a significant impact on the ability of recipients to access and interact with information online. They play an important role in reinforcing certain messages, spreading information virally, and encouraging online behavior (see recital 70 of the Digital Services Act). With the establishment of recommendation systems, a completely new set of problems has arisen with regard to the amount of information conveyed online. This applies in particular because the system offers a large attack surface for possible interference, so the risk of misuse is particularly obvious.

4.2 Digital Services Act

The Digital Services Act (Regulation (EU) 2022/2065) contains two fundamental assertions about recommender systems.Footnote 6 First, the already mentioned recital 70 clarifies that the recommendation system has a central function: “A core part of a very large online platform’s business is the manner in which information is prioritised and presented on its online interface to facilitate and optimise access to information for the recipients of the service. This is done, for example, by algorithmically suggesting, ranking and prioritising information, distinguishing through text or other visual representations, or otherwise curating information provided by recipients.” Second, Article 3(s) of the Digital Services Act provides, for the first time, a legal definition of recommender systems: “‘Recommender system’ means a fully or partially automated system used by an online platform to suggest in its online interface specific information to recipients of the service, including as a result of a search initiated by the recipient or otherwise determining the relative order or prominence of information displayed.”

4.2.1 Problem Description

Recommender systems have opened up a completely new field of problems: In addition to the undisputed benefits that they offer, they also open up the possibility of disseminating disinformation (which is not illegal as such) and of increasingly pushing it to end-users by exploiting algorithmic systems (Schwemer 2021). The due diligence obligations of the Digital Services Act are therefore aimed primarily at protecting the recipients of the services provided via recommender systems, i.e., the end-users of the platform. This is also supported by the wording of recital 70 of the Digital Services Act.

As positive as it is that the European Union recognizes the potential of recommendation systems, it is to be criticized that the regulations are to apply only to “very large online platforms”. The legislator defines online platforms in Art. 3(i) of the Digital Services Act. Accordingly, other Internet intermediaries are exempt from the application of the standards for the protection of users, on the one hand, as are platforms with fewer than 45 million users per month, on the other (cf. Art. 33(1) of the Digital Services Act). This may cover gaming platforms such as Steam (Statista 2022), as it is the largest gaming platform in Europe, but not other providers in this field. The concept of information in Art. 3(s) of the Digital Services Act is, however, to be understood broadly. In light of the wording of recital 70, the law covers the algorithmic suggestion, ranking and prioritization of information.

4.2.2 Regulatory Content Related to Recommender Systems

Very large online platforms within the meaning of the Digital Services Act are thus subject to various transparency requirements with regard to the main parameters of the automated, possibly AI-supported, decision. For the details, a look at Arts. 14, 27, 34, 35 and 38 of the Digital Services Act is recommended. Art. 38 (supplemented by Arts. 14 and 27) of the Digital Services Act requires that users of recommender systems of very large online platforms be provided with alternative options for the main parameters of the system; this includes, in particular, at least one option that is not based on profiling of the recipient. However, this key requirement necessarily assumes that the user knows and understands both the processes and the alternative options. It is unclear to what extent the obligation under Art. 14 of the Digital Services Act also applies with regard to a more detailed explanation of the circumstances. Under that provision, information about content moderation practices has to be provided, e.g., with regard to algorithmic decision making and human review. In addition, the intermediaries addressed in Art. 14 of the Digital Services Act have to act diligently, objectively and proportionately, with appropriate consideration of the rights and legitimate interests of all parties involved (including, in particular, the fundamental rights of users). However, and this argues against assuming obligations with respect to recommender systems here, the provision concerns the restrictions imposed, i.e., content blocking (Schwemer 2021). According to Art. 34 of the Digital Services Act, very large online platforms are required to conduct an annual risk assessment to evaluate any significant systemic risks arising from the operation and use of their services in the European Union. In doing so, they are required to consider in particular how their recommender systems influence any of the systemic risks, including the potentially rapid and widespread dissemination of illegal content and of information that is incompatible with their terms and conditions, cf. Art. 34(2) of the Digital Services Act. Based on this risk assessment, Art. 35 of the Digital Services Act requires the very large online platform to take appropriate, proportionate and effective measures to mitigate the risks, including adapting the recommender systems.

4.3 Artificial Intelligence Act

The Artificial Intelligence Act (COM/2021/206 final) specifically addresses the regulation of artificial intelligence systems. This proposed legislation could also become relevant to recommender systems, particularly in light of the discussion about fairness, accountability, and transparency of certain recommender systems (Schwemer 2021). The proposal follows up on the European Commission’s White Paper on AI by setting policy requirements to achieve the dual goal of promoting the use of AI and addressing the potential risks associated with it.

4.3.1 Purpose of the Draft Act

The Artificial Intelligence Act aims to establish harmonized rules for the development, marketing, and use of AI systems that differ in their characteristics and risks, including prohibitions and a conformity assessment system aligned with European product safety law (Council Directive 85/374/EEC). Much of the wording of the Artificial Intelligence Act derives from a 2008 decision (Decision No 768/2008/EC of the European Parliament and of the Council of 9 July 2008 on a common framework for the marketing of products, and repealing Council Decision 93/465/EEC, OJ L 218/82) that established a framework for certain product safety regulations. The principal enforcement authorities used to review the requirements of the Artificial Intelligence Act – market surveillance authorities – are also common in European product law (Veale and Borgesius 2021).

Article 3(1) of the Artificial Intelligence Act defines an AI system as “software developed using one or more of the techniques and concepts listed in Annex I that is capable of producing results such as content, predictions, recommendations, or decisions that influence the environment with which it interacts with respect to a set of human-determined goals.” In addition, the European Commission distinguishes four levels of AI risk: (1) AI systems with unacceptable risks, which are prohibited; (2) AI systems with high risks, which are permitted but subject to certain obligations; (3) AI systems with limited risks, which are subject to certain transparency obligations; and (4) AI systems with minimal risks, which are permitted (Schwemer et al. 2021).

4.3.2 Regulatory Content Related to Recommender Systems

The proposal’s definition of artificial intelligence in Art. 3 No. 1 of the Artificial Intelligence Act is drafted quite broadly, so that at first glance recommendation systems also fall within its scope. Due to the risk management approach pursued by the proposal, in which foreseeable and other emerging risks are to be assessed (cf. Art. 9 Artificial Intelligence Act), the question arises as to whether a recommendation system would be classified as high-risk. This question is addressed by Art. 6 of the Artificial Intelligence Act in conjunction with Annex III of the draft act: There, eight areas are listed in which the use of AI systems is considered risky. Insofar as a recommendation system is used in the context of legal information, this can probably be affirmed on the basis of the legal requirements; in the case of media and shopping platforms, however, it is unlikely.

Nevertheless, there are transparency obligations throughout – regardless of which risk level an AI system belongs to. Pertinent here is Art. 52 of the Artificial Intelligence Act, which establishes the obligation to inform natural persons that they are interacting with an AI system, unless this is evident from the circumstances or context of use. For example, an advanced chatbot is required to carry the information that the interaction is not with a human being, but with an AI system (Schwemer et al. 2021). In addition, Art. 14 of the Artificial Intelligence Act requires that (at least for high-risk AI systems) human oversight be in place to prevent or at least minimize risks to the health, safety, and fundamental rights of data subjects.

4.4 Dealing with Legal Requirements

A consideration of the two legal acts illustrates that both address the responsibility of recommender systems in terms of fairness, accountability and transparency – however, these aspects are weighted and considered differently. The question therefore arises as to how sufficient transparency, measured against the legal requirements, can be ensured, particularly in the case of multidimensional platforms, which we have described with the metaphor of the “black hole”. How can platform providers disclose which form of recommendation system is in the foreground and which type of algorithm brings the greatest possible success, without exposing themselves and without disclosing trade and business secrets?

Not least with recourse to the European Union Regulation establishing a general framework for securitization and a specific framework for simple, transparent and standardized securitization (Regulation (EU) 2017/2402), a number of industry proposals have emerged on how transparency for recommender systems could be presented.

4.4.1 User-Oriented Transparency

User-oriented transparency could serve as a first proposed solution. This form of transparency aims to direct information to the individual user in order to empower him or her with regard to the recommendation system and its content. The overall goal of this form of transparency is to raise user awareness and inform users of the options available. This should help them develop their own preferences and bring personal values such as individual autonomy, agency, and trust into their decisions (Van Drunen et al. 2019; Leerssen 2020). This consideration takes into account the legal requirements of Art. 38 of the Digital Services Act, which requires transparency about the main parameters and the possibility of alternatives, and underscores the “What does the system do?” question of a system’s users and stakeholders addressed above (Zednik 2021). Thus, in terms of the recommendation system on gaming platforms, users need to be informed whether the recommendation is derived from previous shopping, streaming, or social media activities, accompanied by information about which machine learning methods are used.
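
What such user-oriented information could contain can be sketched as a simple data structure; the field names and values below are illustrative assumptions in the spirit of Arts. 27 and 38 of the Digital Services Act, not a legally prescribed template.

```python
# Hypothetical sketch of what a user-facing disclosure of "main parameters"
# for one recommendation might contain. All fields and values are illustrative.
from dataclasses import dataclass, field

@dataclass
class RecommendationDisclosure:
    item_id: str
    data_sources: list[str]          # e.g., shopping, streaming, social signals used
    filtering_method: str            # e.g., "content-based", "collaborative", "hybrid"
    model_family: str                # e.g., "neural network", "decision tree"
    profiling_used: bool             # relevant for the non-profiling option of Art. 38
    alternative_options: list[str] = field(default_factory=list)

disclosure = RecommendationDisclosure(
    item_id="game_c",
    data_sources=["purchase history", "playtime", "friends' libraries"],
    filtering_method="hybrid",
    model_family="neural network",
    profiling_used=True,
    alternative_options=["chronological", "most popular (no profiling)"],
)
print(disclosure)
```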

A similar type of transparency is also found in Articles 5, 12, 13, 14 and 22 of the General Data Protection Regulation (GDPR). Specifically, it concerns the right to be informed about the parties that influence editorial decisions and about the profiles that are created about groups of data subjects on the basis of the data fed into the algorithms, as well as about the relevant metrics and factors of those algorithms (Van Drunen et al. 2019; Leerssen 2020). This addresses the very basis that also constitutes the “black hole” problem.

However – and this is also correctly pointed out by critics of the transparency requirement – this demand for user-oriented transparency is not entirely without problems. Given the varying levels of prior knowledge and understanding among platform users of how the recommendation system works and of the algorithms and filtering methods behind it, it is difficult to present the explanations in a way that is both complete and comprehensible. According to the requirements of Art. 38 of the Digital Services Act, the user has to be provided with an alternative if he or she does not agree with the parameters used for the recommendation. Furthermore, it is certainly an ambitious goal to include personal values in the selection of recommendations – after all, these are subjective in nature; reflecting public values is even further out of reach. Nevertheless, especially in a niche like online gaming, it is a good start to involve users in deciding on good recommendations, to give them control, and to let them actively exercise their rights to information about the process.

4.4.2 Government Oversight

Another conceivable option for enforcing transparency is government oversight. In this proposed solution, a public body would have the task of monitoring recommender systems for compliance with the transparency standards set by the legal framework or making proposals for their design (Leerssen 2020).

The idea of government oversight and regulation is not new: It can be found in the areas of data protection and competition law in many European countries. State supervision with regard to non-discrimination is also established in the media landscape, for example in the German Media State Treaty. With regard to gaming platforms, however, the question arises as to whether state supervision can succeed, especially in such a peripheral area of media activity. Looking at the state regulation of German gambling law, there have been immense problems with the acceptance of the rules during implementation. Whether state supervision is suitable for the niche area of gaming may therefore be doubted.

In addition to the intentions of the aforementioned legislation (protection of users of online platforms), other interests can also be enforced here, such as public interests and concerns for the protection of minors. The approach of state supervision would therefore have the undeniable advantage that, due to the multitude of state resources, a body with sufficient expertise could be formed to adequately address the multitude of problems. However, this presupposes a statutory reporting obligation on the part of platform operators, which is also difficult to implement in other factual constellations (e.g., plagiarism control, defective products in online commerce). In particular, due to the special role granted to platform operators in the area of telecommunications law, such reporting obligations are difficult to implement (see Gielen and Uphues 2021; Spindler 2021).

4.4.3 Combination of the Two Approaches with Additional Experts

A third approach combines the two ideas above by ensuring the transparency of recommender systems on gaming platforms and having them jointly supervised and monitored by representatives of academia and parts of civil society. This allows for research into the use of recommender systems as well as their practical criticism and questioning (Leerssen 2020). The idea ties in with the aforementioned problems of the other two approaches and tries to reconcile them: The lack of user expertise is compensated for by the insights that research partners bring to the field, while the practical needs, especially in such a niche industry, are determined by the users of the platform. With the insights gained from information about how the system works, regulatory and advisory interventions can be provided.

However, this approach also has a drawback: The law itself limits the effectiveness of this method. Science needs a lot of data to effectively monitor and develop the system, so easy access to data and openness in processing information are desired. At the same time, it is the intention of the legislator (see the General Data Protection Regulation, GDPR) to keep access to data to a minimum. The generated data may only be accessed with the appropriate legal permission. Without the consent of all users of the gaming platform, it will be just as difficult for researchers as for ordinary users to find out which filtering system and which combination of algorithms leads to which result. While the ideal of openness and transparency is advocated and is also important for strengthening trust in the system, it remains impossible to look behind the scenes.

5 Implementation of the Proposed Solutions

However, despite all the theoretical considerations and the question of the appropriate group of experts for implementation, it should not be forgotten that a practical solution is also required. In the following, possible approaches are presented which have already been discussed in other application environments and which also represent valuable considerations for the case described here.

5.1 Standardization

One of the most important keywords with regard to a possible solution is standardization, both in the sense of technical standardization and in the sense of legal standardization and regulation. If it were possible to design a technically comprehensible solution that explains in an understandable way how recommendations are generated on multidimensional platforms such as gaming platforms, and if it were correspondingly clear what kind of filtering process – content-based, collaborative or hybrid – is used, a first step would be taken. Both users and indie game developers would then be able to understand how the user interface is constructed. The “black hole” would then become a “black box” again – and even if this is not a satisfactory state, it is easier to manage because of the existing research results. Harmonization – standardization – of recommender system technology in this area of application would be extremely helpful. It would also support our argument for putting the users of the system at the forefront in terms of information and transparency – because the system would then be at least somewhat easier for users to understand. Standardization is also a suggestion that could and should be considered on the legal side. The advantage of standardization in application is obvious: Recommendation systems for platforms that combine multidimensional decision algorithms would, in this case, work the same way and proceed the same way, and platform operators would be equally committed to standardized transparency. Another advantage relates to the way such standards emerge: The expert group developing the standardization framework is composed of people with appropriate expertise; this group, supplemented by experienced users of the platforms, can profitably monitor and evaluate the security of the IT infrastructure. The more diverse the perspectives represented within the team, the more likely it is that the team’s work product will address all technically, ethically, and legally relevant aspects. Given these conditions, better standards can be set on the basis of the different levels of know-how.

5.2 Control Mechanisms

The problem persists that every AI application remains a “black box” to some degree, even if the recommender system is one-dimensional or technically standardized in a way that at least makes the remaining uncertainty about the system’s nature clear to the user. This makes it all the more important to control and understand the unpredictability of AI to some extent and thus make it manageable. For example, an internal control system could be built in to minimize risk (Bittner et al. 2021). This addresses platform operators’ concern that the recommender system could be abused if its capabilities were published. It also raises user awareness, for example by signaling that the system performs frequent backups. Misuse by users could be counteracted by platform operators regularly taking part in training or using tools to detect and contain such misuse.
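One conceivable building block of such an internal control system is an append-only audit log of every recommendation served, so that the operator or a review body can later reconstruct what was recommended, to whom and by which model version. The sketch below is a minimal illustration under these assumptions; recommend() is a placeholder for the platform's real, unknown recommender.

```python
# Minimal sketch of an internal control component: every recommendation call is
# written to an append-only audit log for later internal or external review.
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("recommender.audit")
audit_log.setLevel(logging.INFO)
audit_log.addHandler(logging.FileHandler("recommendations_audit.jsonl"))


def recommend(user_id: str) -> list[str]:
    """Placeholder for the platform's real recommender (assumption)."""
    return ["IndieGameA", "IndieGameB"]


def recommend_with_audit(user_id: str, model_version: str) -> list[str]:
    """Serve recommendations and record them in the audit log."""
    items = recommend(user_id)
    audit_log.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user_id,                # would be pseudonymized in practice
        "model_version": model_version,
        "recommended_items": items,
    }))
    return items


recommend_with_audit("u1", model_version="hybrid-2023.01")
```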

Different AI systems are subject to different requirements. Since recommender systems have been recognized as an important tool by the European Union, a verification process could be developed that ensures transparency and protects against abuse of market power by the platform operator. Mathematical-statistical models can be used to detect and analyze errors and deviations in the model (Bittner et al. 2021). In this context, it is mandatory to adhere to the multi-eye principle during the development of the software in order to sufficiently ensure quality (Lindstrom and Jeffries 2003). At the same time, the continuous comprehensibility of the AI algorithms needs to be ensured: How are the algorithms structured? According to which rules can they learn? And, particularly relevant in the case of neural networks, how quickly can they evolve, and are the control mechanisms then still sufficient?
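As an illustration of the kind of mathematical-statistical deviation check mentioned above, the following sketch compares the score distribution of the current model output against a reference window using a two-sample Kolmogorov-Smirnov test. The data, window definitions and significance threshold are assumed purely for illustration.

```python
# Minimal sketch of a statistical deviation check: compare current prediction
# scores against a reference window with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
reference_scores = rng.normal(loc=0.6, scale=0.1, size=1000)  # e.g. last month's predictions
current_scores = rng.normal(loc=0.7, scale=0.1, size=1000)    # e.g. this week's predictions

statistic, p_value = stats.ks_2samp(reference_scores, current_scores)

ALPHA = 0.01  # assumed significance level
if p_value < ALPHA:
    print(f"Distribution shift detected (KS statistic={statistic:.3f}); trigger a model review.")
else:
    print("No significant deviation from the reference distribution.")
```

In practice, such a check would run continuously on logged predictions and trigger a human review whenever a shift is flagged.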

Assuming that these extensive considerations cannot be enforced globally, but that the guidelines will have to remain at the national level, there is another helpful instrument: liability rules. Although much of the discussion takes place in the context of public law, the advantages of private law should not be dismissed. A liability framework, for example, can control risks to some extent. However, broader problems are associated with this consideration, particularly issues of conflict of laws (Lutzi 2020). The extent to which liability law or even competition law could support the above approaches (in the sense of reciprocal fallback regulations, see Gesmann-Nuissl 2020) remains to be examined.

6 Conclusion

It is well known that a certain opacity is inherent in AI systems based on deep neural networks. Already at this stage, the demand for explainability of such systems is growing louder. This is also a problem for recommender systems that use AI to recommend suitable (digital) products to the user, and the legislator has recognized it: in the Digital Services Act (Regulation (EU) 2022/2065), recommender systems are explicitly named and defined, and transparency is explicitly demanded by law. The problem we call the “black hole” represents a form of multidimensionality: combining the filtering methods used in known forms of recommender systems, namely content-based, collaborative and hybrid filtering, on a single (gaming) platform exacerbates the opacity of AI systems sometimes known as the “black box” problem. It is therefore particularly important to look beyond the opacity of individual algorithms and to understand the complex interactions between technology and users. A solution to this “black hole” problem needs to focus on all the levels of transparency that the European Union addresses in its legislation: the algorithms used (simple decision trees or deep neural networks), the filtering methods used (content-based or collaborative) and, in particular, the type of recommendation and the content and data used for it. A balanced approach between users, manufacturers, regulators, government and research is needed to address the problem of double opacity and ultimately to increase the confidence of users, but also of platforms, in a technology that, after all, brings many benefits.

Knowing what is technically conceivable, and knowing that it is feasible to technically implement and legally secure the specifications required by the legislative proposals, the Digital Services Act and the Artificial Intelligence Act (Regulation (EU) 2022/2065, COM/2021/206 final), will help us to design and standardize guidelines for transparent AI. All of this is also in the interest of legislators. The Artificial Intelligence Act explicitly requires transparency of any system, regardless of the risk level to which it is assigned. This would be a first step toward regulating and certifying multidimensional recommender systems.

The legal framework imposes certain transparency requirements. To meet these minimum requirements, AI systems in general and recommender systems in particular (especially since they are mentioned by name in the Digital Services Act) need to provide a level of security that grants transparency and, accordingly, explainability to the user. These legal requirements, which will come into force in the near future with the AI Act and the Digital Services Act, can be implemented in various ways for recommender systems. On the one hand, user-oriented transparency is conceivable, which has already been implemented to some extent on gaming platforms. This type of transparency is intended to empower users to control and manage the content of the recommender system, allowing individual values to be better taken into account; the problem remains, however, that users cannot fully grasp how the system works. Alternatively, a government authority could exercise oversight (similar to data protection or competition law). However, the past has shown that specific areas of application (such as recommender systems in this case) are difficult to regulate, especially since this involves certain reporting obligations. Another solution would be to combine the aforementioned approaches, that is, to combine the interests of users with those of state supervision. This would strengthen trust both in the guiding hand of the state and in the application-oriented representation of interests by joint expert committees. These bodies would then also be in a position to implement the issues addressed here, for example by means of standardization, and would thereby help to meet the different requirements of the users and operators of the platforms.