1 Society 5.0 and Recommendation AI in Japan

In response to social issues that arise when knowledge and information are not sufficiently shared across society, Japan's Cabinet Office has proposed a vision of a future society called Society 5.0. Following the hunting-and-gathering (Society 1.0), agricultural (Society 2.0), industrial (Society 3.0) and information (Society 4.0) societies, Society 5.0 aims to develop the economy and solve social issues by integrating cyberspace and physical space (Cabinet Office 2016).

Society 5.0 was announced as part of Japan's Fifth Science and Technology Basic Plan in 2016 and continues to be referred to as the vision of the Japanese government and business entities. For example, Keidanren (Japan Business Federation) issued versions of its 'Healthcare in the Age of Society 5.0' proposal in 2018, 2020 and 2022, presenting a vision of a society that offers new options for diverse needs in health management and medical treatment (Keidanren 2022). The proposal states that 'health management can be achieved by making appropriate recommendations suited to the situation at the time, without having to make a difficult decision', and it is expected that artificial intelligence (AI) customized to individuals will recommend such actions.

While recommendation AI is expected to be used in many fields to realize Society 5.0, there are also many challenges. Some people may psychologically resist recommendations made by AI; it has been pointed out that, especially in Japan, elderly respondents showed negative attitudes towards recommendation AI services (Ikkatai et al. 2022). Data management is another challenge for AI that provides information customized to individuals. For example, in 2019 there was a controversy in Japan surrounding companies' use of data on job-seeking students without the students' consent (Kudo et al. 2020). Whether data is appropriately managed, or is used for purposes for which it was not intended, can influence people's decisions to use AI services. In this paper, we present the results of a survey that examined the perspectives of users in deciding whether they would like to use recommendation AI services.

2 Model for Ensuring Trustworthiness of AI Services

Ensuring the trustworthiness of AI services has been an important topic in recent years. In Europe, a report on Trustworthy AI was released by the European Commission in March 2019. Around the same time, Japan's Cabinet Office released 'Social Principles of Human-Centric AI', which includes building mechanisms to ensure the trustworthiness of AI and of the data and algorithms that support it (Cabinet Secretariat 2019).

Incorporating various AI principles into practical processes has also been gaining attention (Jobin et al. 2019). Trusted AI is achieved not only through technology but also through an 'AI governance ecosystem' comprising elements such as guidelines, auditing, standardization, user interface design, literacy and education (Japan Deep Learning Association 2021), and various models for ensuring trustworthiness have been proposed.

The DaRe4TAI framework by Thiebes et al. examines whether privacy is protected and whether discriminatory judgements occur at all stages of data input, AI model building and output (Thiebes et al. 2020). In Japan, several frameworks have been published to promote trusted AI in society. For example, the Ministry of Economy, Trade and Industry (METI) released the 'Governance Guidelines for Implementation of AI Principles, ver. 1.1' in 2022, which presents action goals for AI providers to implement (METI 2022). An increasing number of companies are establishing their own AI principles and guidelines; Fujitsu Limited, for example, has developed a method for disseminating trustworthy AI to society and published its procedures and application examples (Fujitsu Limited 2022). Research is also being conducted at universities: the Risk Chain Model (RCModel) structures risk factors into three layers, namely the AI system, covering the AI model (e.g., accuracy and robustness) and the system (e.g., data and system infrastructure); the service provider (e.g., behavioral norms and communication); and the user (e.g., understanding, utilization and usage environment) (Matsumoto and Ema 2020). A framework guide and case studies are available for considering ways to ensure the reliability and transparency of AI services by structuring them into these three layers.
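
To make the layering concrete, the sketch below shows one possible way to represent the RCModel's three layers as a simple data structure. It is a minimal illustration based on the description above, not code from the RCModel publication; all Python names are invented.

    # Illustrative only: organizing risk factors into the RCModel's three layers.
    RISK_CHAIN_LAYERS = {
        "ai_system": {
            "ai_model": ["accuracy", "robustness"],
            "system": ["data", "system infrastructure"],
        },
        "service_provider": ["behavioral norms", "communication"],
        "user": ["understanding", "utilization", "usage environment"],
    }

    def layer_of(factor: str) -> str:
        """Return the layer a named risk factor belongs to."""
        for layer, contents in RISK_CHAIN_LAYERS.items():
            groups = contents.values() if isinstance(contents, dict) else [contents]
            for group in groups:
                if factor in group:
                    return layer
        raise KeyError(f"unknown risk factor: {factor}")

    print(layer_of("communication"))  # -> service_provider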

By establishing such ethics policies and guidelines, AI service providers seek to secure social trust and to publicize that they are developing and using technology appropriately. However, having policies and guidelines is not the same as being a truly trustworthy organization. The issue of 'ethics washing' has been pointed out: an organization may claim to behave ethically when this is merely a façade or a 'smokescreen to hide their transgressions' (Dand 2021). Therefore, while it is important to have these principles in place, it is equally important to communicate appropriately with users.

3 Components of a Trustworthy AI Model

While there are several models and frameworks for ensuring the trustworthiness of AI services, this paper proposes a model that places particular emphasis on the interface between the service provider and the user. In particular, we investigate not what technical requirements AI technology must meet to be trustworthy, but rather the hypothesis that how communication between humans and AI is designed may affect whether users trust and use it. For example, even if the AI services offered by companies are identical, user preferences will vary depending on how the purpose of use is explained and how much freedom of choice users are given.

There are many studies on AI-human interface design, especially on the general bases for human trust in automated machines such as AI (Madhavan and Wiegmann 2007). Research points to the performance of the machine (e.g., its abilities), the process by which it works, and whether its design achieves the designer's purposes as the bases for trust (Lee and See 2004). To gain the trust of users as well as designers, it is important that AI behaves in accordance with their preferences. Trust in machines relates not only to the machine itself but also to the governance of the company, such as whether data is appropriately managed. Therefore, we conducted a survey based on the hypothesis that three aspects of interaction are important: (1) AI intervention, (2) data management and (3) purpose of use. Although a variety of AI services are currently available, this paper specifically investigates recommendation AI as a case study.

3.1 AI Intervention

Service providers need to understand users' preferences regarding AI services. For example, some users may not want AI to make decisions and prefer human judgement, while others may be willing to use AI output as a reference as long as a human makes the final decision. Conversely, others may prefer to have AI alone make decisions without any human intervention.

Thus, the degree of AI intervention varies widely, and unless the process is clearly explained, it may be difficult to tell the difference between a pattern in which decisions are made by humans alone and one in which AI is used but the final decision is made by a human. Therefore, from the viewpoint of transparency, service providers are expected to log the process by which decisions are made and to establish organizational governance for providing AI services. Users, in turn, are expected to use AI based on the information and explanations provided by developers and service providers, as considered in the principles of proper utilization in the 'AI Utilization Guidelines' proposed by the Japanese government (Ministry of Internal Affairs and Communications 2019).
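
As a minimal sketch of what such transparency-oriented logging could look like, the following Python fragment records which degree of intervention applied and who made the final recommendation. All names and fields are hypothetical illustrations, not taken from the cited guidelines.

    # Hypothetical sketch: recording the degree of AI intervention per decision.
    from dataclasses import dataclass, field
    from datetime import datetime, timezone
    from enum import Enum
    from typing import Optional

    class Intervention(Enum):
        HUMAN_ONLY = "human decides without AI"
        AI_ASSISTED = "AI suggests, a human makes the final decision"
        AI_ONLY = "AI decides without human review"

    @dataclass
    class RecommendationLog:
        user_id: str
        intervention: Intervention
        ai_suggestion: Optional[str]  # None when no AI was involved
        final_recommendation: str
        timestamp: datetime = field(
            default_factory=lambda: datetime.now(timezone.utc)
        )

    log = RecommendationLog(
        user_id="u123",
        intervention=Intervention.AI_ASSISTED,
        ai_suggestion="reduce salt intake",
        final_recommendation="reduce salt intake; add one vegetable dish per day",
    )
    print(log.intervention.value)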

3.2 Data Management

To ensure the reliability of AI services, users and service providers need a shared understanding of whether data management is appropriate. Some users may want service providers to explain the security measures they employ and how data is managed, while others may simply trust the company and refrain from checking the security measures.

Therefore, service providers are expected to indicate what kind of data management they adopt. While there are various certifications and standards for data management, some companies adopt their own standards, and management methods are diversifying.

3.3 Purpose of Use

When users are deciding whether to use an AI service, it is important for them to know what information the service will collect and for what purpose it will be used. The usual means of communicating the purpose of use is the terms of service agreement.

To avoid overly long and complex terms of service agreements that go unread (Maronick 2014), service providers should offer agreements that are easy to understand and of appropriate length. Formats now exist that allow users to choose which items they agree to, rather than accepting or rejecting everything at once. In addition, some terms are designed so that declining optional items has no negative consequences: the service is still provided even if users do not agree to all of them. The way terms of use are written and presented is thus becoming more diverse.
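
A minimal sketch of such a selectable consent form follows. The item texts and function names are invented examples of the pattern described above, not an actual service's terms.

    # Illustrative consent form separating required from optional items:
    # the service can still be provided when optional items are declined.
    from dataclasses import dataclass

    @dataclass
    class ConsentItem:
        text: str
        required: bool

    FORM = [
        ConsentItem("Collect meal records to generate advice", required=True),
        ConsentItem("Store the data for the duration of the service", required=True),
        ConsentItem("Use anonymized data to improve the AI model", required=False),
    ]

    def can_provide_service(checked: set) -> bool:
        """The service is available as long as every required item is checked."""
        return all(i in checked for i, item in enumerate(FORM) if item.required)

    print(can_provide_service({0, 1}))  # True: the optional item was declined
    print(can_provide_service({0, 2}))  # False: a required item is missing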

4 Verification of Trustworthy AI Model: A Case Study of AI for Dietary Habit Improvement Recommendations

There are various types of recommendation AI services, but this paper takes recommendation AI for dietary habit improvement, which utilizes users' dietary data, as its case study. As Keidanren has proposed a vision of health management in Society 5.0, AI and healthcare services are closely aligned and in high demand. Applications in which AI supports menu planning and nutrition management, and interactive AI that analyses meal menus in real time and guides people towards appropriate eating habits, are already available in Japan.

A survey was conducted from January to March 2021 to examine what kinds of AI services, data management methods and purposes of use would make users willing to use AI for dietary habit improvement recommendations.

4.1 Subjects

A research firm was commissioned to recruit respondents so that Japanese men and women across age groups (20s–60s) were equally represented, and in-depth interviews were first conducted with nine of them. For the subsequent quantitative survey, the same research firm was commissioned to recruit from the same respondent pool, and 500 respondents were included. Since the survey targeted general users, those who responded in the pre-survey that they were 'familiar with AI' and 'not at all reluctant to use AI services' were excluded.

4.2 Verification 1: AI Intervention

In-depth interviews were conducted to determine the preferred degree of AI intervention in recommendation services. Respondents tended to prefer services in which 'AI can be used, but the final recommendation is made by a human' over 'AI makes the recommendation' for dietary improvement advice. Two reasons emerged: AI performance is considered inferior to that of humans, and the ability to ask questions and hold conversations is considered important in healthcare.

Therefore, four pairwise comparisons were displayed and tested in the quantitative survey, as demonstrated in Table 10.1: a dietitian using AI (No. 1) versus AI alone (No. 2); a dietitian using AI (No. 1) versus a high-performance AI that makes suggestions equivalent to those of a dietitian (No. 3); a dietitian who cannot be asked questions or conversed with (No. 4) versus AI alone (No. 2); and that same dietitian (No. 4) versus the high-performance AI (No. 3). Respondents were asked to indicate which option in each pair they would prefer to use for dietary improvement advice.

Table 10.1 AI intervention descriptions

4.3 Verification 2: Data Management

When we conducted in-depth interviews to determine how users evaluate whether data management is properly implemented, several comments indicated an emphasis on whether some kind of certification has been obtained, rather than on detailed explanations of the technologies used for security.

Therefore, as Table 10.2 shows, we displayed three patterns of questions in the survey: ISO certification and a company's own in-house standards as representative examples of certification, and an explanation of the specific security technologies used, to verify which service users would prefer.

Table 10.2 Data management option descriptions

4.4 Verification 3: Purpose of Use

In-depth interviews were conducted to investigate what type of explanation of the purpose of use is preferred in terms of service agreements for AI services and how best to obtain consent. The interviews showed that 'the items for obtaining consent are explained in detail' and 'users can customize what they agree or do not agree to' tended to be preferred.

Therefore, in the survey, we created terms of service agreements (Table 10.3) and displayed three patterns of questions: users agree collectively by checking a single item; users check five items, one per category; and users check all 15 items individually, to verify whether users' preferences change with the number of check items. For the five- and 15-item versions, we also created a pattern with optional check items and compared it with a pattern in which all check items are required.

Table 10.3 Description of purpose of use in terms of service agreements

4.5 Method

The quantitative survey format was created using the maze web tool, and each pair of patterns was displayed simultaneously on the left and right sides of the screen, with the differences between them highlighted. Respondents were asked to choose which service they would prefer to use (Figs. 10.1 and 10.2).
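
For reference, the percentages reported in Figs. 10.3, 10.4, 10.5 and 10.6 can be read as simple preference shares from these side-by-side choices. The fragment below is a minimal sketch of that arithmetic, assuming each respondent makes one forced choice per pair; the function name is illustrative.

    # Tallying side-by-side A/B picks into preference percentages.
    from collections import Counter

    def preference_shares(choices):
        """choices: a list of 'A' or 'B' picks, one per respondent."""
        counts = Counter(choices)
        n = len(choices)
        return {option: round(100 * counts[option] / n, 1) for option in ("A", "B")}

    # e.g., 373 of 500 respondents picked the dietitian (A) over AI (B)
    print(preference_shares(["A"] * 373 + ["B"] * 127))  # {'A': 74.6, 'B': 25.4}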

Fig. 10.1 Example of screens with different 'AI intervention'. (Both screens describe an AI-based diet improvement service with cutting-edge security technology; in A the best diet improvement methods come from a professional dietitian, in B from AI)

Fig. 10.2 Example of screens with different 'terms of service agreements'. (Both templates show what personal data will be collected and the purpose of use; A has five check items with customization, two mandatory and three optional, while B has 15 mandatory check items with no customization)

4.6 Results

4.6.1 AI Intervention

Figure 10.3 shows user preferences across the recommendation patterns: whether a human or AI ultimately makes the recommendation, whether the AI performs as well as a human, and, conversely, whether the dietitian (human) cannot be asked questions. The figure shows that the strongest preference is for services in which the final recommendation is made by a dietitian (human) using AI. Even when a dietitian (human) is compared with a high-performance AI that can make judgements similar to a human's, the human recommendation prevails. Conversely, when the dietitian (human) cannot be asked questions, the result reverses and the share of those who prefer the AI recommendation increases. These results suggest that users value the interactivity of the recommendation service, such as the ability to hold a conversation, more than high AI performance. This tendency was more pronounced among the group that reported being 'resistant to AI services' (Fig. 10.4).

Fig. 10.3 Preference for AI intervention (n = 500). Dietitian vs. AI: 74.6% vs. 25.4%; dietitian vs. high-performance AI: 58.8% vs. 41.2%; dietitian unable to ask questions vs. AI: 40.2% vs. 59.8%; dietitian unable to ask questions vs. high-performance AI: 27.2% vs. 72.8%

Fig. 10.4 AI service preference of those with AI service resistance (n = 500). Dietitian vs. AI: 81% vs. 19%; dietitian vs. high-performance AI: 79% vs. 21%; dietitian unable to ask questions vs. AI: 49% vs. 51%; dietitian unable to ask questions vs. high-performance AI: 40% vs. 60%

4.6.2 Data Management

What do users look for to determine whether a recommendation AI's data is being appropriately managed? Figure 10.5 shows user preferences between patterns in which ISO certification or a company's in-house standards are clearly stated and a pattern that explains the specific security technology used. The figure suggests that some kind of certification, whether ISO or a company's own standards, matters more for gaining user trust than an explanation of specific security technologies.

Fig. 10.5 Preferences for data management (n = 500). ISO vs. technical description: 75.4% vs. 24.6%; in-house standards vs. technical description: 63.8% vs. 36.2%; ISO vs. in-house standards: 68.6% vs. 31.4%

Compared with in-house standards, users tend to prefer ISO, suggesting that socially recognized certification is considered more reliable. In this survey, the in-house standards were described as being established by 'major well-known Japanese companies'. Contrary to the assumption that a standard set by a major well-known company would be considered more reliable, users preferred ISO, suggesting that displaying one's own standards carries little weight regardless of company size or name recognition. Rather, the results suggest that earning socially recognized certification such as ISO would contribute more to user preference.

4.6.3 Purpose of Use in Terms of Service Agreements

Which consent formats do users prefer? Figure 10.6 shows user preferences across patterns that differ in the number of consent check items and in customizability. The figure shows that users prefer the formats with more consent check items. This suggests that a larger number of items is preferred, even though reading through them is more burdensome.

Fig. 10.6 Preferences for terms of use (n = 500). More vs. fewer check items: 5 items vs. 1 item: 59.4% vs. 40.6%; 15 items vs. 1 item: 63.4% vs. 36.6%; 15 items vs. 5 items: 59.6% vs. 40.4%. Customized vs. non-customized: 5 items: 55.6% vs. 44.4%; 15 items: 59.4% vs. 40.6%

Meanwhile, users preferred the customizable pattern, with 'optional' check items in addition to 'required' ones, regardless of the number of consent items. This suggests that users prefer patterns that let them choose the contents of consent themselves. However, in this survey, respondents did not actually have to click each checkbox. The results should therefore be verified further, as the cumbersomeness of the operation might prevail if users were actually required to click each box.

5 Necessary Elements for Trusted AI

Technical developments such as explainable and fair AI are considered important for trusted AI services. Among the non-technical aspects, this paper focuses on communication between users and service providers. To facilitate this communication, we investigated what kind of AI intervention, data management and purpose of use users would consider acceptable for a dietary habit improvement recommendation AI.

The reliability of services in general depends on users understanding an accurate description of how they work. However, AI behavior changes with the environment in which it is used and with the algorithms that result from the learning process. Given that predicting its behavior is difficult not only for users but also for service developers, transparency and accountability are important for users to trust AI services. Specifically, communication is required so that there is no discrepancy in understanding between users and service providers regarding AI intervention and data management.

The results of this survey suggest that people tend to trust recommendation AI that is communicative, allowing users to ask questions and engage in dialogue when recommendations are made. For those who are resistant to AI, rather than promoting the technical message of 'being a high-performance AI', it is likely more acceptable to develop a service that lets users talk to and question a person while AI performs the analysis in the background. Advertising a high-performance AI that can give advice equivalent to a human's, or providing detailed information about the technology used for data management, is unlikely to have a significant effect on user preference.

In addition, the ISO pattern for data management was highly rated. However, when respondents who preferred the ISO pattern were asked in the in-depth interviews whether they could explain what ISO is, few knew more than that it is a standard of some type. Furthermore, the results showed that more detailed consent items in the terms of service agreements were preferred, even though other studies show that people do not read such agreements (F-secure 2014; Japan Fair Trade Commission 2021). This suggests that the detailed content itself serves as a sign of trustworthiness. These results indicate that non-technical aspects, such as socially formed understandings, have a greater influence on user preferences than the actual existence of technical guarantees or the fact that they are explained. This study therefore suggests that service providers should not ignore these non-technical perspectives when pursuing trusted AI.

This is a case study of recommendation AI for dietary improvement services. However, recommendation AI services are employed in various fields, and not all of them resemble the one in this survey. Health management and medical-related fields are presumably areas where interaction with people is particularly important; for complex route guidance, by contrast, users may prefer recommendation AI over people because AI handles complex processing better. Therefore, an international trustworthiness survey focusing on AI intervention, data management and purpose of use across various service cases is needed to establish the requirements for trustworthy AI.