1 Introduction

The presence and application of digital services has become an integral part of personal daily routines as well as of business processes. In the last two decades an increasing number of companies have adopted business models based on digital services derived from user data. These developments are accompanied not only by changes in business processes but also by growing societal attention to the effects and the constitution of these services.

Moral and ethical demands on the design and application of digital technologies, especially in the machine learning context, are steadily rising and are resulting in regulatory activities in some countries. In this paper, we use a scenario case study of a recommender system to identify potential ethical issues within the composition and usage of this digital artifact, i.e., software or an information system, as well as the corresponding legislation in force. We explain which ethical aspects are addressed by legal requirements and show that without social awareness and responsibility, legal regulations alone cannot guarantee socially compliant information technology (IT). Here, we use the scenario of a food recommender system, the FoodApp, to demonstrate the multi-dimensional effects that such a data-based information system can have. The widespread use of smartphones for purchasing goods and services has led to a debate in research and the public about the effects of recommender systems on users’ behavior and thus about potential ethical issues. A food recommender system encapsulated in a digital application obscures the relation between the consumer and the market by providing a physical good (food) shortly after a virtual interaction (app-based selection and purchase). The effects of this interaction on the environment, the business and the stakeholders involved are largely invisible to the consumer, making this scenario helpful for the analysis of ethical issues. Since legal measures incorporate the ethical values and norms of a society, this paper uses ethical analysis to identify issues that might occur when using a food recommender system. The results of the legal analysis of this scenario are then used to identify the ethical issues that lack legal equivalents. These results can inform public discussion about the scope and necessity of future legal regulations.

Hence, the contribution of the research is twofold. First, an analysis method combining ethical and legal assessment is described as part of the development process of an IT system. Second, the application of the combined ethical and legal analysis to a Machine Learning-based System (MLS) is presented with a focus on recommender systems, and its implications are outlined. For the legal analysis, legislation of the European Union (EU), especially the General Data Protection Regulation (GDPR), is considered. The ethical aspects are analyzed using the data-oriented ethical analysis (Levina 2020) and relying on the ALTAI requirements as suggested by the European Commission (European Commission 2020) and subsequently restated as a legislative proposal in April 2021 (European Commission 2021). The ethical analysis looks at the data process within the system’s design and identifies some of the relevant points in the development process where ethical questions arise and influence the design of the system. The legal analysis reveals where the emerging ethical questions are already legally regulated. While the presented results cannot be considered exhaustive as to the incorporation of ethical and legal issues during MLS design and use, they provide an exploration framework for designers, developers and users of such systems as well as for Legal and Information Systems Researchers (ISR) by offering an addition to the ISR research methods.

The paper is structured as follows: A brief overview of the state-of-the-art research on ethical issues related to recommender systems (Sect. 3) follows the scenario introduction of the FoodApp (Sect. 2). The ethical analysis and the legal analysis are conducted in Sects. 4 and 5 respectively, before the combination of their results is presented and discussed in Sect. 6. The paper concludes with an outlook on research perspectives.

2 An Example Application: FoodApp, the Application for Meal Delivery

To demonstrate the application of the ethical and legal analysis, a scenario of a fictional food recommendation application, the FoodApp, is used here. The fictional app description is adopted from Levina (2022).

The FoodApp is a fictional application based on a three-sided digital platform that is implemented as a mobile app. It is a branch of the fictional large company Acima, which offers on-demand individual transportation provided by freelancing drivers. To further explore the transportation market, Acima started the FoodApp, a fast-growing food delivery platform connecting the customer, the restaurant owner and the delivery partner. It allows the customer to choose from a large database of participating restaurants and order a menu to be delivered to the customer’s address via delivery partners. The eater can choose a specific delivery partner based on the ratings of the currently available partners. The payment process is integrated into the platform, as is the real-time tracking of the order delivery.

The platform’s business goal is “fast and easy food delivery whenever, wherever”. To achieve this goal, an MLS, a recommender system, is used to provide the best food suggestions for the user in accordance with the indicated preferences and the order history. The business performance indicators for the FoodApp include the return and re-order customer rates, as well as customer number growth rates. The implemented ML model is thus optimized to drive the user’s re-ordering on the platform.

To use the FoodApp, the customer downloads it to a mobile device, granting permission for the app to access the device’s location. Further, a profile including the delivery address, name, e-mail address and phone number is required. A payment method must also be provided, and the customer must log in to the payment provider. No manual modifications concerning data collection by the app are possible. Then, meal preferences such as a preferred cuisine or menu item need to be indicated, or a meal can be chosen from the provided suggestions. The first suggestions are based on the historical frequency of orders made within the community in the area of the eater’s location. A rating system for restaurant and delivery partner performance is implemented.

The platform gains revenue from the customer via convenience charges, and fixed commissions and marketing fees from the restaurants, while providing the assignments and the payment to the delivery partners, as well as the technical infrastructure for the platform participants. The application is a key driver of Acima’s revenue and is a fast-growing meal delivery service with over 15 million users worldwide. Additionally, the platform includes an app for delivery partners that provides the possibility to accept or decline a specific delivery job, monitor revenues, and rate the restaurant’s delivery process, as well as directions to the restaurant and to the eater.

3 Current Approaches to Ethical Analysis of Recommender Systems

Ethical issues in the context of IT artifacts have gained increasing attention in research over the last decade. Paraschakis explores e-commerce recommender applications and identifies five ethically problematic areas: user profiling, data publishing, algorithm design, user interface design and online experimentation, i.e., exposing selected groups of users to specific features before making them available to everybody (Paraschakis 2016, 2017).

Milano, Taddeo, and Floridi conduct an exhaustive literature review of the research on recommender systems and their ethical aspects and identify six areas of ethical concern: ethical content, i.e., content that is or can be filtered according to societal norms; privacy as one of the primary challenges of a recommender system; autonomy and personal identity; opacity, i.e., a lack of explanation of how the recommendations are generated; fairness, i.e., the ability not to reflect social biases; and polarization and social manipulability through insulating users from different viewpoints or specifically promoting one-sided content (Milano et al. 2019).

Milano, Taddeo and Floridi also show that recommender systems are designed with the user in mind, neglecting the interests of a variety of other stakeholders, i.e., interest groups that are directly or indirectly affected by the recommendation (Milano et al. 2019). Polonioli presents an analysis of the most pressing ethical challenges posed by recommender systems in the context of scientific research (Polonioli 2020). He identifies the potential of these systems to isolate and insulate scholars in information bubbles. Popularity biases are also identified as an ethical challenge, potentially leading to a winner-takes-all scenario and reinforcing discrepancies in recognition.

Karpati, Najjar and Ambrosio analyze food recommendation systems and identify several ethically questionable practices (Karpati et al. 2020). They name the commitment to already given preferences, and thus to the values of the designers, as a contradiction to the potential for ethical content. The authors identify privacy, autonomy and personal identity as potentially vulnerable and suggest that protecting them requires informed consent and disclosure of the business model used. They further point to opacity about the origin of the recommendations as well as about the criteria and algorithms used to generate them. Fairness, polarization and social manipulability, as well as the robustness of the system, complete the list of identified ethical issues for a food recommender.

These approaches discuss the ethical impacts of recommender systems from the perspective of the receivers of the recommendations. Milano, Taddeo and Floridi argue that social effects such as manipulability and the personal autonomy of the user are hard to address, as their definitions are qualitative and require the recommender system to be considered in the context in which it operates (Milano et al. 2019), while Karpati, Najjar and Ambrosio offer a multi-stakeholder approach to address these issues (Karpati et al. 2020).

The data processing-centered approach to analyzing ethical issues suggested by Levina (2020) identifies the decision points during MLS development, while advocating the inclusion of a laboratory phase in the system design to assess the potential consequences (see also Coravos et al. 2019). This research applies a data process-centered combination of ethical and legal analysis in an attempt to identify how, or whether, the identified threats to ethical values that can be realized via an MLS are already covered by EU legislation and where in the artifact design they can be addressed.

4 Ethical Analysis

Here, the ethical analysis following the steps of data processing in an MLS is applied. To understand the ethical implications of a process or technology, the specific values affected by the deployed technology need to be taken into account. There are a number of values that can be considered, but here we look at the values represented as requirements in the “Assessment List for Trustworthy AI (ALTAI)” (European Commission 2020) by the High-Level Expert Group on Artificial Intelligence of the European Commission, which are also the fundament for the regulation proposal for governing artificial intelligence technologies (European Commission 2021): human agency and oversight; technical robustness and safety; privacy and data governance; transparency; diversity, non-discrimination and fairness; societal well-being; and accountability. These have been chosen specifically to assess ethical aspects of AI technology and are therefore appropriate to our goal.

The data processing-centered ethical analysis puts privacy and data governance in its focus and thus allows a stringent analysis of the legal aspects along the EU’s General Data Protection Regulation (GDPR). The specific stages of data processing considered here are: sense, including collecting data, detecting data, and selecting (problem-related) data; transform, including cleaning the data, choosing the machine learning model, training the model, and integrating the model into the software system; act, including generating recommendations or, more generally, a result based on the chosen ML model; and the apply stage, which considers the effects of the results on stakeholders as well as the effects introduced by the ML model’s definition and application. These data processing stages are based on the data management process by Schutt and O’Neil (2013). While the apply stage is not an integral part of the data process, the effects of the application of the MLS on user behavior are important for its design. Furthermore, a distinction is made here between the MLS user, i.e., a person using the MLS directly, and an MLS-affected user, i.e., a person or stakeholder affected by the effects of the MLS.
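The four stages can be illustrated with a deliberately simplified sketch; all function and field names here are our own illustrative assumptions, not part of the cited process model.

```python
# Illustrative sketch of the sense / transform / act stages; the apply
# stage lies outside the software artifact, as it concerns the effects
# of the results on users and other stakeholders.

def sense(raw_events):
    """Collect, detect and select problem-related data."""
    # keep only events carrying the fields the recommender needs
    return [e for e in raw_events if "user_id" in e and "item" in e]

def transform(selected):
    """Clean the data and derive a (trivial) model: per-user order counts."""
    model = {}
    for e in selected:
        counts = model.setdefault(e["user_id"], {})
        counts[e["item"]] = counts.get(e["item"], 0) + 1
    return model

def act(model, user_id):
    """Generate a recommendation from the trained model."""
    counts = model.get(user_id, {})
    # recommend the most frequently ordered item, if any
    return max(counts, key=counts.get) if counts else None

events = [
    {"user_id": 1, "item": "pizza"},
    {"user_id": 1, "item": "pizza"},
    {"user_id": 1, "item": "salad"},
    {"incomplete": True},  # dropped in the sense stage
]
model = transform(sense(events))
print(act(model, 1))  # -> pizza
```

The sketch makes visible where ethical decision points sit: what counts as "problem-related" in sense, and which objective the trivial model in transform optimizes.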

In the sense phase of data processing, it needs to be assured that the data have been collected with the informed consent and voluntariness of the data subject. The legal analysis presented in Sect. 5 elaborates further on these considerations. Informed consent also includes the statement of the purpose of the data collection, implying an opt-in function for data collection. What data are being collected is normally described in the terms and conditions document of the recommender system. Their legal sufficiency is further discussed in the legal analysis. Often, under the claim of providing more personalized suggestions, profile data provided by the user as well as, e.g., location data automatically provided by the mobile device are also collected. Nevertheless, it is neither clear to the user when the location data are collected, challenging the value of privacy, nor what role these data play in recommendation creation, addressing the value of transparency. While opt-out functions are often cited by enterprises producing data-based services as user empowerment (Abdelaziz et al. 2019), they are legally insufficient (see legal analysis) and shift too much responsibility towards the user (Milano et al. 2019).

The FoodApp’s business goal is to engage the user in re-ordering food via its digital platform. The user interacts with the app aiming for a comfortable and efficient provision of the favorite food and is therefore inclined to give up some autonomy within this process. Nevertheless, in the digital realm the user is often not aware of which elements of his/her autonomy are jeopardized when the digital service, here food selection and ordering via a digital platform, is performed. While the user can still change the app’s settings, s/he is often unaware of the default access requirements. Data are the fundament for the subsequent model building for the recommendation algorithm, and their amount and sources are strongly defined by the business model that the MLS is supposed to support.

As the FoodApp would like its users to return to the app, it will need, among other factors, very good recommendation results as well as a frictionless ordering process together with a reliable problem-handling mechanism to fulfill basic customer expectations (Karpati et al. 2020). The first requirement, i.e., very good recommendation results in terms of the user’s preferences, can be realized using a recommendation algorithm based on the data collected from the user as well as from users with similar preferences or history on the platform (Karpati et al. 2020). Since user activity data might provide additional patterns for the recommendation, there is also a potential incentive to keep the user engaged in the app for as long as possible, which might involve the use of dark patterns in the app design (Gray et al. 2018). The FoodApp’s user profile provides information that is, among other things, needed by the algorithms in the MLS to derive food recommendations. The user has no information about the exact purpose of the provided datasets, the data lifecycle, who has access to the (possibly) non-anonymized profile or historical data, or the data state timeline, i.e., when the data are transferred or deleted. These aspects can be categorized as “transparency issues”, since the user does not have the information about the FoodApp’s processes that s/he might need or would like to have.
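A recommendation based on users with similar order histories can be sketched, for illustration only, as a minimal user-based collaborative filter; the similarity measure (Jaccard overlap) and all data below are assumptions made for this sketch, not the FoodApp's actual algorithm.

```python
# Minimal user-based collaborative filtering sketch (illustrative only):
# recommend items ordered by users whose history overlaps most with the
# target user's, excluding items the user has already ordered.

def similarity(a, b):
    """Jaccard overlap between two users' order sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(histories, user, k=1):
    target = histories[user]
    scores = {}
    for other, items in histories.items():
        if other == user:
            continue
        sim = similarity(target, items)
        for item in items - target:  # only items the user hasn't tried
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)[:k]

histories = {
    "ann": {"pizza", "sushi"},
    "bob": {"pizza", "sushi", "ramen"},  # similar to ann
    "eve": {"salad"},                    # dissimilar to ann
}
print(recommend(histories, "ann"))  # -> ['ramen']
```

Even this toy version shows why such systems reinforce similarity: the dissimilar user's choice ("salad") is scored near zero and never surfaces.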

Aside from the user data, the FoodApp’s database would include data on the restaurants available for ordering and delivery through the platform. Approaching the restaurants is part of Acima’s business model and might also be part of the business focus of the restaurants, as they can be included on the platform according to specific criteria, e.g., reviews on other platforms, personal preferences, or number of years in business, leading to a potential pre-selection of the available food choices on the platform. Additionally, a delivery network of partners who pick up the food at the restaurants and deliver it to the customer’s door needs to be established and equipped with the means to be contacted, paid and managed by the platform. Hence, the FoodApp needs to establish an ecosystem, similar to a classic supply chain, to be able to fulfill its business goal or even to be able to operate according to its business model. Building up such an ecosystem, together with the ability to manage the orders for delivery, provides Acima, as a digital platform, with specific power over the delivery partners as well as the restaurants, which can have far-reaching effects on the partners involved in the ecosystem as well as on the wider circle of stakeholders (Levina 2019).

The user can filter the suggestions within the FoodApp using the provided filter categories. These categories, defined by the MLS engineers and designers, include cuisine and menu item names, as well as the ratings of the corresponding restaurants. In future interactions with the FoodApp, its home screen offers the meals and food items most frequently ordered by the eater or by users identified as having similar ordering behavior, thus nudging the eater to order the same or a similar kind of food (Zhou et al. 2010).
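The consequence of designer-defined filter categories can be sketched minimally: only the attributes modeled as filters are selectable by the user, and everything else stays outside the user's reach. All restaurant names, cuisines and ratings below are invented for illustration.

```python
# Illustrative sketch: whatever the designers did not model as a filter
# category (here: cuisine, minimum rating) cannot be selected by the user.

RESTAURANTS = [
    {"name": "Sole", "cuisine": "italian",  "rating": 4.6},
    {"name": "Kiku", "cuisine": "japanese", "rating": 4.8},
    {"name": "Taco", "cuisine": "mexican",  "rating": 3.9},
]

def filter_suggestions(cuisine=None, min_rating=0.0):
    """Apply only the designer-defined filter categories."""
    return [r["name"] for r in RESTAURANTS
            if (cuisine is None or r["cuisine"] == cuisine)
            and r["rating"] >= min_rating]

print(filter_suggestions(min_rating=4.0))  # -> ['Sole', 'Kiku']
```

A dimension the designers omit, say packaging waste or delivery distance, simply has no handle in this interface, which is one concrete form the diversity and transparency concerns above can take.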

Also, how the data are analyzed, i.e., anonymously or connected to the profile, might raise further privacy and transparency questions. In cases where the recommender system dictates what data are collected, the mode of data collection and their weight within the algorithm are also decided without user participation. Hence, questions about the diversity and quality of the data, and consequently of the recommendations, arise. The lack of diversity can manifest itself in other aspects that are indirectly related to the business goal, such as feedback from the stakeholders, the effects on the ecosystem the MLS is acting in, and the potential impact of its recommendations on the environment. In the case of a food recommender as in Karpati et al. (2020), these can be a considerable environmental impact in terms of air pollution (Saner 2020) in addition to the generated plastic waste (Zhong and Zhang 2019).

The transformation phase is the last phase within the data process where the fundamental decision to apply machine learning, instead of a less resource-intensive solution, to realize a business goal can be revised. Training and using an MLS requires increased power usage. The parameters within the MLS that lead to recommendations are designed and configured by data analysts and software engineers, thus reflecting their values and reducing the diversity of the recommendations, as well as promoting decisional de-skilling (Floridi 2016) and increasing the user’s trust in the algorithmic suggestions (Gille et al. 2020; Krügel et al. 2021; Fritz et al. 2020).

To reduce the energy and time resources needed to train the particular model in use, an existing pre-trained model from the application domain can be used. However, the choice of the model often considers neither its explainability (Kamiran and Calders 2012) nor its resource requirements. Its accuracy is often the main criterion, which might neglect the user’s preferences. Also, observations of the effects of MLS application on user behavior rarely precede the launch of the MLS.

The FoodApp has based its business model on the data-based provision of food recommendations and the forwarding of the recommendations to the restaurants and delivery partners. Being data-based, these business questions require the use of data analysis tools, although the added value of neural networks for the recommendations depends on the quality of the data and the accuracy thresholds defined by the product designers.

The model quality is at the center of the ethical inquiry in the transformation phase. The thresholds that are set determine the choice of mathematical methods, e.g., neural networks versus support vector machines, and thus the resources needed to train the model as well as to generate the recommendations. The transform phase does not only include the training and optimization of the models used for the recommendation; it also considers the integration of the ML models into the information system context.
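How an accuracy threshold set by product designers can drive the model choice, and with it the resource consumption, might be sketched as follows; the candidate models, their relative costs and their accuracies are purely hypothetical stand-ins.

```python
# Illustrative sketch: the designer-set accuracy threshold decides
# whether a cheap model suffices or a far costlier one is selected.
# All names, costs and accuracies are hypothetical.

CANDIDATES = [
    # (name, relative training cost, measured validation accuracy)
    ("popularity-baseline",  1,   0.71),
    ("matrix-factorization", 10,  0.82),
    ("neural-network",       100, 0.86),
]

def choose_model(threshold):
    """Return the cheapest candidate meeting the accuracy threshold."""
    viable = [c for c in CANDIDATES if c[2] >= threshold]
    if not viable:
        raise ValueError(f"no candidate reaches accuracy {threshold}")
    return min(viable, key=lambda c: c[1])[0]

print(choose_model(0.70))  # -> popularity-baseline
print(choose_model(0.80))  # -> matrix-factorization
```

In this toy setting, raising the threshold from 0.70 to 0.80 multiplies the training cost tenfold for a gain of a few accuracy points, which is exactly the resource trade-off the ethical inquiry in this phase asks about.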

While the definition of food categories as well as the selection of the included cuisines and restaurants is part of the sense phase, and especially of the select sub-phase, questions in the transform phase focus on the mathematical transformation of these selected details. The inclusion of, e.g., nudging techniques is also part of the sense phase and the collect sub-phase, but it is strongly defined by the business model.

The act phase comprises the actual initiation of an action based on the provided MLS recommendation. The interaction with the recommender is often implemented as a “human-in-the-loop” interaction pattern that puts the human in control only of the recommendation-based action itself. It often leaves the user oblivious to the process initiated by this action and to its consequences, as no feedback from outside the user-focused process is taken into account.

For Acima, value is created when a food delivery order is completed in the FoodApp. Hence, the ordering process is organized in such a way that no extended explanations or additional information are given, so that the user does not have to choose, decide or react during the interaction process. This design allows a short time span between opening the FoodApp and ordering the food. This effect can be expected to contribute to user satisfaction and thus to re-visiting the platform for the next order.

The process efficiency offered by the FoodApp is also built on the lack of decision possibilities and a limited item selection based on the user’s history and profile preferences. Additionally, the comfort gained by the user in terms of food selection and delivery has implications for the ecosystem of the FoodApp. The restaurant partners are faced with an increased number of reviews from delivery customers, potentially forcing them to concentrate on robust packaging to ensure the sound condition of the meal on delivery. More robust packaging means more damage to the environment, but potentially better ratings from FoodApp users (Zhong and Zhang 2019).

Furthermore, the food recommendations based on historic and similar orders might lead to a homogenization of the food offered in the participating restaurants, as menu items that are ordered less often might no longer be prepared, potentially leading to a decrease or shift of the cooking staff’s skills. The individual delivery of the food orders requires reliable and efficient delivery partners. Acima relies here on its network of drivers for personal transportation, who are also incentivized via reward programs to transport food orders. This efficient and effortless process of ordering food for individual consumption can and does cause significant environmental damage in terms of air pollution through traffic and waste (Zhong and Zhang 2019).

Further effects on the social environment can also occur. The eater rates the restaurant on food quality and the delivery partner on the quality of the delivery. The rating is based on the eater’s satisfaction with the end result, whereas the traffic situation and other external effects of the recommendation process are not considered. This relationship pattern causes societal effects that are visible in the traffic situation, environmental damage, and the reduction of labor costs and conditions, and it also affects user behavior (De-Arteaga et al. 2020).

While the aspects analyzed in the previous phases were based on the data process as an enabler of business value creation, the apply phase leaves the realm of the software system as an artifact and enters the realm of its usage, which can directly or indirectly influence the behavior of individuals. One potential effect might be reduced diversity in the choices that survive in such ecosystems, since the recommender system focuses on providing suggestions based on, e.g., an item’s popularity, leading to the extinction of less favored choices. Collateral effects, e.g., on air quality, the quantity of waste and the quality of life via increased delivery traffic, are commonly observed in cities where food recommenders are active (Karpati et al. 2020; Saner 2020; Zhong and Zhang 2019). Thus, even if the MLS is not directly critical to human life, rights or well-being, there is a necessity to consider the societal effects of its implementation, as outlined in the criticality assessment of algorithmic systems by the German Data Ethics Commission (Germany. Datenethikkommission 2018), which takes the effects of the application of an MLS on critical goods such as human life and well-being into consideration.

In the FoodApp scenario, the rating of the delivery partners results in an increasing number of orders for highly ranked drivers and in a reduction of delivery orders for the lower-ranked drivers, thus promoting the reviews to the main factor for job acquisition, and thus income, for the drivers. This type of job market is known as the gig economy (Friedman 2014). It provides income potential for the workers while creating an interdependency between the platform customer and the gig worker. This relation seems to remain unclear to the platform customer and is often disputed by the platform owner (Susser and Grimaldi 2021). Consequently, the OECD stated in 2016 that digital platforms need social values to be reflected in their governance (OECD 2016).

5 Legal Considerations

Using an MLS can pose novel challenges for legal compliance. Ideally, there is a common goal between legal experts, ethicists and technicians: an optimized and legally compliant design and implementation of machine learning-based artifacts. To further this, some main aspects are presented using the food recommender system as an exemplary scenario.

5.1 Data Protection Law

The availability of large amounts of data is essential for machine learning. But when personal data are involved, the European General Data Protection Regulation (GDPR) comes into play. There are actually very few data, such as pure machine data, that no longer have any personal reference. Personal data refers to all information relating to an identified or identifiable natural person.Footnote 1 This broad definition covers all information that can somehow be attributed to a specific person. Even pseudonymous data, like an IP address, are classified as personal data because, although not directly, they can be attributed to a natural person by means of a transfer.Footnote 2

In the FoodApp, user, restaurant and delivery data are collected, selected, processed and stored. All of this constitutes data processing in the sense of data protection law. It is clear, not only because of the tremendous possible penalties and finesFootnote 3 imposed by the GDPR, that it is absolutely necessary to plan MLS such as the FoodApp in compliance with data protection regulations. However, the issues of big data, Machine Learning (ML) and algorithms were not explicitly addressed by the GDPR. Only very few references can be found, e.g., Art. 22 GDPR, which regulates automated individual decision-making, including profiling. Even though Art. 22 GDPR appears extremely relevant to ML processes at first sight, its legal scope is actually rather limited. According to the prevailing and convincing view,Footnote 4 Art. 22 GDPR applies, despite its rather open wording,Footnote 5 only to those automated processes that are intended to evaluate certain characteristics and features of a person. Hence, the data process itself is subject to the general provisions of the GDPR, which were not specifically designed for ML processes. Subsequently, some points of relevance are presented.

5.2 General Principles and Lawfulness of Processing Personal Data

First, Art. 5 GDPR is to be mentioned, which lays down the general principles of data processing such as lawfulness, fairness and transparency. Art. 5 No. 1 a GDPR reads: “Personal data shall be processed lawfully, fairly and in a transparent manner in relation to the data subject”. Equally relevant is Art. 6 GDPR, which regulates the lawfulness of processing. The main scope of the transparency requirements is defined by the information obligations in Art. 12 and Art. 13 GDPR, which enable the exercise of the rightsFootnote 6 of the affected data subject. Art. 12 GDPR initially states that all information and communications must be provided in a precise, transparent, comprehensible and easily accessible form in clear and simple language. When data are collected for use in ML, affected persons need to be explicitly informed about this fact. Thus, the FoodApp should explicitly state which data are being collected and the purposes for which they are used.

However, transparency seems relatively difficult to fully accomplish in this context, because it also requires explainability of the decision process in algorithmic decision procedures. It is essential that users of algorithmic systems are at least able to understand, explain and control their operation, and that those affected receive sufficient information to enable them to properly exercise their rights under Art. 12–23 GDPR. It appears to be ruled out that the process can be reproduced in all details afterwards.Footnote 7 So it is to be assumed that only the principle underlying the algorithm, i.e., the basic assumptions of the logic underlying the algorithm, but not the concrete computation formula or the algorithm itself, must actually be presented to fulfil the information obligations in Art. 12 and 13 GDPR.Footnote 8 The biggest difficulty of a transparent presentation probably lies in describing the complexity of the analytical procedures used in such a clear and easy way that their scope is somehow tangible for the person affected.Footnote 9 On the other hand, companies may also claim that the disclosure of the algorithm and data categories would affect their business model, pointing to their business secrets.Footnote 10 In order to balance the interests of the MLS user and the MLS-affected user, the provisions of Art. 42 Paragraph 1 GDPR, as well as data protection certification procedures and data protection seals and test marks, might represent a suitable instrument for proving compliance with the GDPR by the person responsible or their contract processors.Footnote 11
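One pragmatic way to present only the basic logic underlying the algorithm, rather than the concrete computation formula, might be to surface the strongest ranked factors behind a suggestion in plain language. The following sketch is purely illustrative; the feature names and weights are hypothetical and do not represent any real system.

```python
# Illustrative sketch: expose only the ranked factors behind a
# recommendation in plain language, not the model internals.
# Weights and feature names are hypothetical.

def explain(feature_weights, top_n=2):
    """Return a plain-language summary of the strongest factors."""
    ranked = sorted(feature_weights.items(),
                    key=lambda kv: abs(kv[1]), reverse=True)
    factors = [name for name, _ in ranked[:top_n]]
    return "This suggestion is mainly based on: " + ", ".join(factors)

weights = {
    "your past orders": 0.6,
    "orders of similar users": 0.3,
    "restaurant rating": 0.1,
}
print(explain(weights))
```

Such a summary discloses the logic's basic assumptions without revealing the computation itself, which is one conceivable compromise between the information obligations and the claimed business secrets.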

5.3 Lawfulness

In addition to being transparent and fair, data processing must also be lawful under Art. 5 and Art. 6 GDPR. Processing personal data is generally prohibited unless the data subject has consented to the processing or one of the other legal bases stated in Art. 6 GDPR applies. It seems tempting to regulate data processing for ML systems in general terms of use or general terms and conditions: if a customer wants to use a certain service or app that uses ML, he or she simply has to agree to the terms and thus also to the processing of data for ML purposes.

Nevertheless, an opt-in or even opt-out in terms of use is not equivalent to informed consent. The basic requirements for the effectiveness of valid informed consent are defined in Art. 4 and Art. 7 GDPR and specified further in recital 32 of the GDPR. Consent must be given on a voluntary (free) basis, which implies a real choice by the data subject. Therefore, the tying prohibitionFootnote 12 applies: the fulfilment of a contract may not be made dependent on consent to the processing of further personal data that are not necessary for the fulfilment of this contract. Though one can already doubt the voluntary aspect here, a major problem again lies in the transparency of information. For consent to be informed and specific, the data subject must at least be notified of what kind of data will be processed, how it will be used and the purpose of the processing operations.Footnote 13 If all these conditions are met, an opt-in to general terms of use/terms and conditions could be sufficient. An opt-out, on the other hand, is never sufficient, as consent must be an unambiguous action.Footnote 14
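Translated into application logic, these requirements can be sketched as a consent check that treats only an explicit, informed opt-in as valid. This is a minimal illustration under our own assumptions; the class and field names are invented for this example and the actual legal test is, of course, richer than any boolean check:

```python
from dataclasses import dataclass

@dataclass
class ConsentRecord:
    # Hypothetical consent record; field names are illustrative only.
    purpose: str                 # e.g. "ml_recommendations"
    opt_in: bool                 # explicit affirmative action (opt-out never suffices)
    informed: bool               # user was told what data is processed, how, and why
    required_for_contract: bool  # data strictly needed to deliver the ordered service

def may_process(record: ConsentRecord) -> bool:
    """Sketch of the Art. 4/7 GDPR consent test: consent must be an
    unambiguous, informed opt-in. Data strictly necessary for contract
    performance rests on Art. 6 No. 1 b GDPR rather than on consent,
    which also reflects the tying prohibition: optional processing must
    not be bundled with the contract itself."""
    if record.required_for_contract:
        return True
    return record.opt_in and record.informed
```

The key design point is that the default path (no opt-in recorded) refuses processing, mirroring the rule that silence or a pre-ticked box is never valid consent.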

Second, it is questionable whether another legal basis would apply if consent cannot be assumed. It would be conceivable in particular that the use of ML is necessary within the meaning of Art. 6 No. 1 b GDPR for the performance of a contract to which the data subject is party. That means the ML algorithm must be used and also be necessary to enable the service provision to a customer, e.g., a food recommendation. However, ML will probably be used more often to optimize processes, not to make them possible in the first place. Another commonly used basis for lawful data processing can be found in Art. 6 No. 1 f GDPR: “processing is necessary for the purposes of the legitimate interests pursued by the controller […].” If a company argues that it has a legitimate interest in using ML because its services would otherwise not be competitive, for example, this might actually be sufficient in an individual case. As the legal assessment depends on the individual case and its interpretation, many uncertainties remain.

5.4 Purpose Limitation and Access to Data

Art. 5 No. 1 b GDPR requires a purpose limitation, which means that data may only be processed for specified, explicit and legitimate purposes and must not be processed beyond those purposes. Thus, the purpose legitimates the processing of the data in terms of necessity, adequacy, completeness and duration of the processing.Footnote 15 Once the purpose is fulfilled, the collected data must be deleted; the purpose may only exceptionally be modified, namely if the new and the initial purposes are compatible (Art. 6 paragraph 4 GDPR). In order to ascertain whether a purpose of further processing is compatible with the initial purpose, the controller should take several factors into account, e.g., the context in which the personal data have been collected, the nature of the personal data and the consequences of the intended further processing for data subjects.Footnote 16 This is interpreted rather restrictively. In this context, it is important to point out that the GDPR does not take a stance on the issue of “data ownership”. Therefore, the question arises who actually “owns” the original and generated data, has access to them and can use or even sell them. According to the prevailing opinion, “data ownership” does not exist.Footnote 17 Rather, the current consensus seems to be to realize data sovereignty in each individual case through contractual agreements between the parties involved in a data exchange. This further underlines the absolute importance of the purpose limitation, as it restricts the power of disposal and the data subject retains his or her sovereignty.
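In software terms, purpose limitation can be enforced by a guard that refuses any processing whose purpose was not named at collection time. The following sketch rests on our own assumptions (the registry, names and exception type are invented); a real compatibility test under Art. 6 paragraph 4 GDPR would additionally require a case-by-case legal assessment that no code can replace:

```python
class PurposeLimitationError(Exception):
    """Raised when data is about to be processed beyond its stated purpose."""

# Hypothetical registry of purposes named at collection time
# (Art. 5 No. 1 b GDPR); keys and values are illustrative.
COLLECTED_FOR = {
    "order_history": {"order_fulfilment", "food_recommendations"},
    "location": {"order_fulfilment"},
}

def check_purpose(data_category: str, intended_purpose: str) -> None:
    """Allow processing only for purposes specified when the data were
    collected; anything else must be rejected (or escalated for a
    separate Art. 6 (4) GDPR compatibility assessment)."""
    allowed = COLLECTED_FOR.get(data_category, set())
    if intended_purpose not in allowed:
        raise PurposeLimitationError(
            f"{data_category!r} was not collected for {intended_purpose!r}"
        )
```

Calling such a guard at every processing entry point makes purpose violations fail loudly instead of silently widening the use of the data.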

5.5 Data Minimization and Storage Limitation

In order to derive accurate results using MLS, considerable amounts of data are required. These data need to be collected, selected and stored to be usable. The data collection in the FoodApp is therefore already conceptually opposed to the principles of data minimization and storage limitation in Art. 5 No. 1 c, e GDPR, as the data is collected without the purpose being named and without a clear attachment to the business model. According to these principles, only as much data as absolutely necessary may be collected, and stored only for as long as absolutely necessary. So, how can a balance be struck between the conflicting goals of the large-scale data collection exercised by some data-based information systems and data protection? Art. 25 GDPR obliges data controllers to design their systems technically in such a way that the risks for data subjects are minimized (privacy by design) and that default settings ensure that only personal data necessary for the purpose are processed (privacy by default). As stated in Art. 25 (2) GDPR, privacy by default should ensure that personal data is processed with the highest privacy protection. Hence, personal data is made accessible only to a definite number of persons, and only personal data that is necessary for a specific reason shall be obtained. The principles of data minimization and purpose limitation are closely related to this concept (Ježová 2020).

As a consequence, the FoodApp would have to provide a data-protection-friendly default setting and offer the user a detailed selection option. In addition, an individual setting by the user, which can be made at any time, must be possible.
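Such a privacy-by-default configuration could be sketched as follows. This is an illustration under our own assumptions; the setting names are invented for this example and do not correspond to any real FoodApp interface:

```python
# Hypothetical default settings: every optional data category starts
# disabled (privacy by default, Art. 25 (2) GDPR); the user can opt in
# to each category individually and change the choice at any time.
DEFAULT_SETTINGS = {
    "share_order_history_for_recommendations": False,
    "share_location_beyond_delivery": False,
    "share_data_with_group_companies": False,
}

def effective_settings(user_choices: dict) -> dict:
    """Start from the most protective defaults and apply only the
    user's explicit per-category opt-in choices; unknown keys are
    ignored rather than silently enabling new data sharing."""
    settings = dict(DEFAULT_SETTINGS)
    for key, value in user_choices.items():
        if key in settings:
            settings[key] = bool(value)
    return settings
```

The design choice here is that a user who never touches the settings gets the most protective configuration, and each opt-in is granular, matching the requirement of a detailed, individually adjustable selection.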

Yet here another divergence is evident: manufacturers or software developers de facto have influence on the data protection conformity of data processing and are able to implement privacy by design. However, manufacturers or software developers are actually not responsible in the sense of the GDPR, but at most commissioned data processors; the controller, the one that uses the MLS, remains responsible (Art. 24 GDPR). In this context, Art. 28 GDPR states that the controller shall use only processors providing sufficient guarantees to implement appropriate technical and organizational measures in such a manner that processing will meet the requirements of the GDPR and ensure the protection of the rights of the data subject. But this does not seem to be sufficient, as no real obligation is imposed on the manufacturer. The manufacturer or software developer should thus somehow be included in the scope of this provision.Footnote 18

Art. 25 paragraph 1 GDPR describes that pseudonymization might help to effectively implement these data protection principles and thus to protect the rights of the data subjects. The exact way in which the legislator envisages this is not explained in any detail, not even in the recitals. Art. 25 GDPR does not provide a solution for the principle of storage limitation, but the legislator has set out specific requirements in the recitals of the GDPR: the personal data should be limited to what is necessary for the purposes for which they are processed. In order to ensure that the personal data are not kept longer than necessary, time limits should be established by the controller for erasure or for a periodic review.Footnote 19 The principle thus requires erasure routines that take effect at regular intervals and prevent endless storage.
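Such an erasure routine can be sketched as a periodic job that deletes records once their purpose-bound retention period has elapsed. This is a minimal illustration; the retention periods, record structure and purpose names are assumptions of this example, and the concrete time limits would have to be justified per purpose by the controller:

```python
from datetime import datetime, timedelta

# Hypothetical purpose-bound retention limits (storage limitation,
# Art. 5 No. 1 e GDPR); the periods below are illustrative only.
RETENTION = {
    "order_fulfilment": timedelta(days=30),
    "food_recommendations": timedelta(days=365),
}

def purge_expired(records: list, now: datetime) -> list:
    """Keep only records whose retention period has not yet elapsed.
    Each record is a dict with 'purpose' and 'collected_at' keys;
    records with no registered retention period are dropped as well,
    since indefinite storage is exactly what the principle forbids."""
    kept = []
    for record in records:
        limit = RETENTION.get(record["purpose"])
        if limit is not None and now - record["collected_at"] <= limit:
            kept.append(record)
    return kept
```

Run on a regular schedule, such a routine implements the "time limits for erasure or periodic review" the recitals call for.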

5.6 Accuracy, Security and Impact Assessment

Data allow a reconstruction of an individual’s characteristics or information and must therefore be accurate in order to permit such a reconstruction. The principle of accuracy is concretized in Art. 5 No. 1 d GDPR. Accordingly, every reasonable step must be taken to ensure that personal data that are inaccurate with regard to the purposes for which they are processed are erased or rectified without delay. Therefore, there should be an obligation to evaluate the systems on a regular basis, as is the case for the security measures under Art. 32 GDPR.Footnote 20 In its recitals, the regulation makes clear at various points that strict security measures are important, because personal data processing can lead to severe physical, material or non-material damage.Footnote 21 In order to maintain security and to prevent processing in infringement of the regulation, the controller or processor should evaluate the risks inherent in the processing and implement measures to mitigate those risks, such as encryption.Footnote 22 In Art. 35 GDPR, the legislator provides an instrument for the evaluation of risks, the so-called “data protection impact assessment”, which must be carried out for sensitive processes before they are introduced. The legislator has hereby provided for a risk-based “technology impact assessment”, which serves to take a closer look at the effects of the use of new technology on society and the environment.Footnote 23 The outcome of the assessment should then be taken into account when determining the appropriate measures to demonstrate that the processing of personal data complies with the GDPR.Footnote 24 However, MLS that are trained using external data might be very easily manipulated. Since the MLS decision criteria are often unknown, it is difficult to foresee and block all conceivable manipulation attempts. Accordingly, it is to be expected that MLS will be subject to special security requirements. Here, special certifications and audits could be helpful in the future.
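One risk-mitigating measure named by the GDPR itself, pseudonymization (Art. 25 paragraph 1), can be sketched as replacing a direct identifier with a keyed hash. The sketch below uses the standard HMAC-SHA256 construction; the key name and identifier format are assumptions of this example:

```python
import hashlib
import hmac

def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).
    The same user always maps to the same pseudonym, so records stay
    linkable for analysis, but the mapping can only be reversed (by
    re-computing candidates) by whoever holds the key, which must be
    stored separately from the pseudonymized data."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because a plain unkeyed hash of a small identifier space could be reversed by brute force, the secret key is what gives this construction its protective effect, and its separate storage is an organizational measure in the sense of Art. 32 GDPR.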

6 Results of the Combined Ethical and Legal Analysis Approach

The ethical analysis was conducted along the data processing steps of an MLS and under consideration of the ethical values suggested in ALTAI. The ethical aspects discussed here accord with those previously identified by Paraschakis (2016, 2017) for recommender systems in e-commerce, such as user profiling, privacy and online experimentation, and by Karpati, Najjar and Ambrosio for a food recommender system (Karpati et al. 2020).

Also, some of the identified ethical issues can be associated with the ethical areas of concern found by Milano et al. (2019), such as social manipulability. Additionally, using the FoodApp scenario, some aspects were found that cannot easily be assigned to one of the categories in the previous work, such as environmental concerns induced by the increasing traffic due to individual food delivery and by packaging waste. Questions related to the use of digital applications for individual services, resulting in the emergence and stimulation of the gig economy, were also evident in the FoodApp scenario. Here, the delivery workers are a major part of the business model, while also facing uncertain labor conditions and minimal autonomy within their employment (Rosenblat and Stark 2016; Doteveryone 2019; Aguiléra et al. 2018). The last issue has been addressed by the European Commission and resulted in a directive that defines platform work as well as the workers’ rights, especially in the context of algorithmic decisions (European Commission 2021).Footnote 25

Using the FoodApp scenario, several ethical issues were identified. The legal analysis showed that most of these issues have corresponding legal requirements that need to be implemented into the digital product for compliance. On the other hand, the combined ethical and legal analysis identified some ethical issues that are not yet included in the considered legislation.

The FoodApp scenario demonstrated the lack of transparency in the collection and use of user data. The purpose and the lifecycle of the data are not visible to the user, depriving him or her of autonomous decision making about the data collection. An informed choice about the impact on the user’s privacy is thus not possible. These ethical issues are addressed by the GDPR in Art. 5, which requires a specified, explicit and legitimate purpose for data collection. Art. 12 GDPR requires transparent information, communication and modalities for the exercise of the rights of the data subject. The scenario further showed the lack of a possibility to opt out of specific data collection and therefore raised the ethical issue of autonomous control over the algorithm parameters of a recommender system. Art. 25 and 32 GDPR require privacy by default and an opt-in for the specific data aspects to be collected, making an opt-out option insufficient for compliance. Art. 17 GDPR states the right to erasure by the data subject, thus allowing the user to end the use of her or his data by the recommender system algorithm. Another transparency and privacy aspect identified in the FoodApp scenario is that the FoodApp is part of the Acima enterprise. Since the user does not have any information about the data lifecycle, it is fair to assume that Acima stakeholders or products can have access to FoodApp user data. The legal analysis showed that this ethical issue of accessibility is addressed by Art. 29 and Art. 5 GDPR, which require a contractual agreement between data subject and controller or between controller and processor.

The FoodApp scenario demonstrated the lack of a feedback loop from the app’s stakeholders. The ordering and delivery processes are not reflected upon with the restaurants involved. The waste and traffic issues are neither included in the recommender algorithms nor monitored together with the local authorities. On the contrary, FoodApp’s algorithm is optimized for re-ordering and hence for an increase in individual deliveries and the waste associated with them. The values of social and environmental well-being and the associated ethical issues, such as the reduction of waste, traffic and fuel consumption, are thus not considered. Also, no legal provision was identified that addresses these issues in the digital consumer context. While FoodApp’s business model relies heavily on the network of delivery partners, the job assignment algorithm is not known to the delivery persons, and the effect of user reviews on job assignments is transparent neither to the users nor to the delivery partners. These ethical issues of fairness and job accessibility are also not addressed by a legal requirement.

Autonomous decision making as well as the tendency toward ethical consumption is endangered in the context of the FoodApp by the optimization goal of re-ordering and the given choice of restaurants and cuisines. The user is not enabled to make suggestions about restaurants or cuisines or to forage for new options, entrenching existing skills and tastes within the app. The working conditions of the delivery persons are not visible or known to the user, and rewarding mechanisms, e.g., tips, are not made available. In this process, the user is detached from the physical part of the service.

Combining legal and ethical analysis shows that some of the identified ethical issues are already covered by existing legislation. Nevertheless, broader negative effects, such as the effects on the environment or society, are part of the social awareness and responsibility that are not (and maybe should not be) regulated, but can be supported by socially acceptable IT artefacts.

Legal norms can be implemented differently by those responsible for business and design decisions, as they do not provide specific processes for their implementation into digital technologies. Ethical issues and views are more complex and dispersed than legal norms, and as such their implementation into digital products can be less explicit. Therefore, we introduce and use the term socially aware IT, as such a system would consider and integrate the legal and ethical requirements into the design of the information system. The added effort could lead to a socially acceptable IT product. To ensure lasting and homogeneous quality adherence, inter-company assessment mechanisms could be put in place.

Ethical issues that occur due to the use and implementation of digital products have been identified in research over the years. The FoodApp scenario analysis has demonstrated some of the complex effects that MLS can introduce as well as the resulting ethical issues. While some of these ethical issues have been converted into legislation in some countries, e.g., within the European Union, others are actively debated in terms of regulatory needs. ALTAI provides a set of ethical values that need to be considered when an MLS is being designed or used. While ethical values are not as binding as legal requirements, some of them are already incorporated into legislation, i.e., the ones provided by the European Commission. An ethical analysis can uncover the ethical issues that do not yet have a legal equivalent. It is a matter of democratic discussion to decide whether and which ethical issues will need legal regulation and which ones can rely on the social contract.

7 Conclusion and Outlook

In this paper, a combination of legal and ethical analyses was presented and applied to the use case of a food recommender system, the FoodApp. This analysis approach revealed aspects of ethical concern, such as decisional deskilling and the emergence of structures of economic dependency, as well as additional effects on the environment induced by the individualization of the recommended service. The data-based approach chosen here offers an actionable scope for the system engineer and data analyst to include ethical and legal compliance during the design process of the ML component by identifying operationalized ethical issues within software development.

Legal analysis of the FoodApp scenario showed that many of the ethical concerns are already addressed by the European GDPR. However, as it targets a broad area of data processing applications, the GDPR depends on interpretation, future jurisprudence or even new, more detailed legislation. Especially the values of social and environmental well-being, and even some of the aspects addressing human autonomy and oversight, might need a legal foundation.

The study showed that users of digital services may have expectations that are in part already covered by legal regulations, but some of the identified ethical issues also rely on the ethical awareness of the company and thus go beyond legal compliance. While the ethical analysis revealed issues that go beyond the existing regulations on data protection, the legal analysis showed the range of interpretation of the legal regulations. Also, issues that are a matter of business ethics rather than legal regulation were identified, such as awareness of the environmental impact and of the labor market effects originating from the broad usage of digital platforms.

The combined analysis showed that MLS, specifically the examined food recommendation system, have effects that concern not only data processing but also extend beyond the direct interaction between the user and the system. While the GDPR addresses data processing aspects such as user privacy and transparency, the effects of the usage of a food recommender system require an interdisciplinary discussion about the need for further regulation.