1 Introduction

The design and implementation of artificial intelligence (AI) systems are expanding rapidly. Still, the AI alignment and control problems, i.e., how to build these systems in ways that aid rather than harm humans, remain unresolved. This article contributes to these discussions, which lack coherence and a common understanding of AI values, by proposing original instruments for empirically testing how humans evaluate the often-conflicting principles of designing, using, and implementing AI systems.

The academic community has widely discussed the need to formulate, test, and adapt ethical guidelines, principles, and values for developing AI. Some of these discussions have addressed the elements and effectiveness of AI-related ethical principles and values in general (for an overview, see Hagendorff 2020). Other studies have discussed increasing conflicts in AI values, in particular the tension between the expanding need for the explainability of AI and the need to ensure environmental sustainability given the growing energy footprint of data processing (König et al. 2022). The need for value negotiations is obvious when data or machine learning algorithms are transferred from one society to another (Masso et al. 2022), and cross-border data sharing is projected to grow significantly as AI solutions spread globally. There is thus no shortage of ethical guidelines worldwide. However, there is a lack of knowledge on why AI developers still face social dilemmas in AI ethics (Strümke et al. 2021), which restricts the widespread adoption of ethical best practices. There is also a lack of empirical knowledge about how values are embedded in artificial artefacts and diverse social contexts (Helberger et al. 2020), as seen by the humans who relate to these solutions as targets or developers, and about how values related to AI are conceptualised.

Current discussions on AI’s social and political consequences primarily focus on ethical standards as a solution to emerging challenges, as expressed in soft-law initiatives, including ethical directives, guidelines, and certificates (Hagendorff 2020) or public AI strategies (Toll et al. 2020). However, operationalising these moral demands in practice when developing artificial artefacts is difficult. For instance, developers have to balance personal values against the company’s goals, which may not overlap with the values that citizens want AI systems to strive towards. Values related to AI systems are therefore dynamic rather than static, with changing and differing understandings across domains, sectors, countries, and individuals. Nevertheless, much remains to be learned about how values at these various levels change as AI is developed and implemented.

The body of studies on AI and values is growing steadily, and this research, from a normative perspective, foremost discusses which shared values the developed solutions should correspond to (see, e.g., Fatima et al. 2022; Züger and Asghary 2022; Wang 2022; Wirtz and Müller 2019). These discussions have mapped the public values related to digital transformations in general (see, e.g., Wang 2022; Bannister and Connolly 2014) or public value creation related to AI applications in public administration in particular (see, e.g., Züger and Asghary 2022; Wirtz and Müller 2019). However, we still lack instruments for empirically measuring how citizens understand the AI values formulated as value ideals in the normative guidelines.

In addition, there have been extensive academic discussions on human values and their emergence, evolution and dissolution, including theoretical and empirical research (among others, see, e.g., Hanel et al. 2018; Cieciuch et al. 2014; Davidov et al. 2014; Rokeach 2008; Inglehart et al. 1998; Schwartz 1994). Despite the increasing interest in measuring human values embedded in artificial artefacts (Han et al. 2022; Kasirzadeh and Gabriel 2022; Umbrello 2022), there are still no instruments that would enable measuring and explaining such value understandings. Therefore, there are still gaps and an urgent need for empirical studies and conceptualisation of values in AI, especially from a citizen perspective that moves beyond high-level standards and guidelines (van de Poel 2020) or public administration ideals and public values (Masso and Männiste 2020; Bannister and Connolly 2014; Rose et al. 2015). Seeing how citizens across different social groups and countries prioritise values is also essential because there is an urgent need to better understand AI’s potential uses in tailoring the human-centric approach (Lomborg et al. 2023).

Few previous studies and initiatives have taken an explicitly citizen-focused perspective, namely how citizens experience the increased implementation of AI, particularly in public administration and welfare provision contexts. Exceptions include Ingrams et al. (2022) and Gesk and Leyer (2022), who examine citizens’ preferences regarding governmental use of AI. What is still missing is a focus on which values these AI systems should embed and on whether and how fundamental values are understood differently across individuals, countries, domains, or systems. In this article, we implement a comparative approach to studying diverging values in Estonia, Germany, and Sweden, which differ in their use of AI solutions in the public sector (Charles 2009). Furthermore, we take a human-focused perspective to describe, map, and explain the importance of AI-related value estimates. This study proposes original survey instruments to measure citizens' understanding of, agreement with, and disagreement with the values of AI as articulated in moral guidelines (Hagendorff 2020). Applying exploratory factor analysis, this research seeks the potential underlying dimensions of the values as seen by citizens, based on the correlations between the originally constructed value items. Although we apply an exploratory factor analysis approach, theory building is beyond the scope of this empirical article. However, we hope that the proposed value items and conducted analysis contribute to the discussions on the conceptual framework of basic values in AI by indicating potential future directions for measuring, validating, and elaborating on AI values.

2 Literature review

2.1 From value beliefs to embedded values

Although there is a wide array of research on human values in general (Inglehart 2018), values in AI systems (Ryan et al. 2022; van de Poel 2020), and public value creation in public administration in particular (see, e.g., Fatima et al. 2022; Züger and Asghary 2022; Wang 2022; Wirtz and Müller 2019; Janssen and Kuk 2016), values are widely considered notoriously hard to define.

In this study, we take as a starting point that ‘value’ can be used both as a noun and a verb, as both a thing and an activity; thus, value can be attributed to an object and can be actively produced or experienced (Bolin 2011). Value beliefs are, therefore, always socially produced and relational, meaning that they are created and expressed in relation to a particular object, such as AI systems. However, based on the theories and empirical studies of human values (Hanel et al. 2018; Cieciuch et al. 2014; Davidov et al. 2014; Rokeach 2008; Inglehart et al. 1998; Schwartz 1994), we assume that people’s understanding of data technologies does not only form attitudes, beliefs, and expressions of trust or fear regarding the emergence of AI. Instead, people’s perceptions of publicly formulated value principles regarding AI might also constitute values that motivate people to act in a specific direction (e.g., to design, use, and interact with artificial artefacts). Furthermore, there have been suggestions to distinguish between first-order values, referring to fundamental rights and standards, and second-order values, including more abstract concerns (Viscusi et al. 2020). For example, Nick Couldry has argued (Couldry 2010: p. 2) that having a voice is a second-order value about values or ‘a reflexive concern with the conditions for voice as a process’.

In the context of AI and emerging digital technologies, several attempts have been made to define central values, mainly in the form of ethical guidelines that aim to safeguard public values. For example, van de Poel (2020) refers to the EU High-Level Expert Group on AI and lists respect for human autonomy, prevention of harm, fairness, and explicability as central values that should be embedded in AI. Furthermore, the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems (The IEEE Global Initiative 2022) has added respect for human rights, well-being, data agency, transparency, and accountability. An analysis of AI strategies in the Nordic countries (Robinson 2020) has identified trust, transparency, openness, autonomy, privacy, and democracy as recurring values that should be considered when implementing AI.

However, the fundamental question remains: how are these values interpreted and embedded explicitly in emerging technologies? Van de Poel (2020) suggests a model that moves from intended values to embedded and realised values to explain the process of value creation, expression, and implementation in AI systems. He argues that designers of AI systems are responsible for embedding values into technologies that, in turn, should realise those values. Nevertheless, his model ends there and remains, to a certain extent, unclear about how users and citizens evaluate the realisation of values. At each step, we also need to consider the perspective of those who are exposed to and act in conjunction with intelligent and autonomous systems.

Empirical studies on the embeddedness of values in digital technologies have addressed how values are embedded within specific AI technologies (van de Poel 2020). Other studies have mainly addressed the values that emerge or change with the introduction of AI within specific use areas such as public administration (Janssen and Kuk 2016). These empirical studies have systematically mapped the AI values within strategic documents (Toll et al. 2020) and made efforts to classify the suggested AI guidelines (Dexe and Franke 2020). Research has also systematically examined the AI values seen in policy areas (Valle-Cruz et al. 2019) and which public values are most referenced in AI strategies (Viscusi et al. 2020). This research has also strived to use the categories of public value ideals found in the context of general digital transformations, like the values of professionalism, efficiency, service, and engagement (Rose et al. 2015) or values oriented towards duty, service, or broader social goals (Bannister and Connolly 2014), and to adapt and test these in the context of the public values driving AI initiatives in governments (Masso and Männiste 2020). Although research has also suggested systematically examining how different actors express and are affected by the implementation of AI and how they evaluate such technological changes (ibid.), as far as we know, there are no comparative studies in this field.

Examining the citizen’s perspectives on what the strategies emphasise and what governments plan, as reflected in the ethical guidelines, would contribute to a better understanding of reflexive concerns about AI development as a process. A better understanding of the citizens’ perspectives of the values, as inscribed in the ethical guidelines, would also contribute to preventing the potential negative behavioural impacts, like harms and risks, when designing, using, and implementing AI systems.

2.2 Values embedded in AI systems

Values are embodied in AI systems (van de Poel 2020) due to design activities intended to embed those values in such systems. Because artificial artefacts are created to serve human purposes (Fatima et al. 2022), these design activities are based on the values prevailing in societies. Therefore, these values are often publicly negotiated and agreed upon by formulating the ethical principles, guidelines, and criteria as an indicative framework recommended for the design and use of AI systems (for an overview, see e.g., Hagendorff 2020). Based on that, we assume that the values formulated in these guiding principles are not only publicly constructed and negotiated, but are also internalised and can be expressed by people.

Initial discussions on AI and values have sought to find a common understanding of the potential ethical principles and guidelines for developing, using, and implementing AI. Values embedded within technologies are visible in specific guidelines and standards developed for AI (Hagendorff 2020). In these documents, a strong emphasis has been put on technical values such as privacy and integrity as well as transparency and efficiency (Toll et al. 2020), which are necessary for providing public and private services. More recently, discussions on values such as interoperability, i.e., assuring the technical transferability of algorithms from one system to another through seamless services, have also emerged (Wimmer et al. 2018). These technical values are increasingly connected to the more recently emerging value of explainability, which should safeguard and allow the interpretability of a machine learning algorithm and its output by a human in an acceptable way (König et al. 2022; see Table 2).

The second strand of research engages with the broader implications of using AI in different sectors of society. These studies engage, for example, with the role of human agency in AI and its implications for social inequality (Eubanks 2018), anti-discrimination (Lyon 2005), and justice (Dencik et al. 2016). Attention to the human and social aspects of AI systems has grown with the increased interest in transferring AI systems from one society to another: in addition to technical adaptability criteria, the need to ensure the social adaptability of AI systems across social contexts is increasingly discussed (Masso et al. 2022) because the development of AI systems is highly context-specific. Studies within this strand of research also engage with the motivations and legitimation of introducing AI technologies. For example, they have pointed to the value of facts and evidence-based decision-making (Misuraca et al. 2012) and the value of efficiency and innovation (Lowrie 2017) on the one hand, and the value of care (Kasapoglu et al. 2021) and social welfare (Dencik and Kaun 2020) as potential motivating factors for designing AI systems on the other.

The third body of research concerning values and AI focuses on how actors relate to and evaluate the implementation of AI in different areas. For example, Ranerup and Henriksen (2019) have explored how civil servants experience the implementation of Robotic Process Automation for fully automated decision-making in social services and, in particular, how the forms and extent of discretion are changing. In turn, Helberger et al. (2020) have explored public attitudes and expectations related to automated decision-making among citizens in the Netherlands. With the introduction of AI and algorithmic systems in public administration, values are also assumed to potentially change (Jørgensen and Bozeman 2007). However, this change is relatively inert due to the interrelatedness, causality, and hierarchy of value developments across various interrelated parties, including society, citizens, public administration, and politicians.

In sum, the discussions on values embedded in AI systems have often focused on questions of transparency and explainability to open the black box of AI. Although further theorisation of basic values in AI is suggested in prior studies (van de Poel 2020), a further empirical examination is needed to better understand the importance of the values for citizens and the implicit expressions of the values visible in the latent associations of those values.

3 Data and method

3.1 Sample structure

This article draws on a representative online survey to examine how population groups see the values of AI. The analysis is based on a survey conducted in Estonia, Germany, and Sweden between 18 October and 9 November 2021.

The questionnaire was developed by the authors of this article and sent to a representative sample of the population in Germany, Estonia, and Sweden. The countries were chosen primarily to represent central welfare-state regimes and both post-socialist and ‘old’ democracies. Following and extending the typology of welfare-state regimes (Esping-Andersen 1990), the chosen countries represent a social-democratic welfare state model (Sweden), a corporatist-statist welfare state model (Germany), and a post-socialist welfare state model (Estonia). The three countries also differ in the degree to which AI and automated decision-making are used in the public sector. While Estonia and Sweden are expected to have a high level of digitalisation and use of AI in general (Charles 2009), Germany is less advanced in this process, although it has expressed high ambitions for future digitalisation. At the same time, Germany has a different relationship to questions of privacy and the use of AI, having longer traditions of public discussion about social norms and values concerning the use of population data in the public sector (Schmidt and Weichert 2012).

The fieldwork was conducted with the opinion and social research company Kantar Sifo, which works with online population panels. The Sifo online panels consist of members recruited through nationally representative surveys. For this study, the sample was formed based on the principle of territorial representation, so the study invitation was sent to the online panel members who met the predetermined criteria (age, gender, place of residence). The survey was conducted among the population aged 18–75. Specifically, the survey was sent out to 10,118 persons in Estonia, 12,506 persons in Germany, and 6083 persons in Sweden (see Table 1). The response rate was 14.8 per cent in Estonia (N = 1500), 16 per cent in Germany (N = 2001), and 16.4 per cent in Sweden (N = 1000). The Estonian sample was extended to 1500 by adding 500 further participants to the original 1000 respondents; this expansion was undertaken to secure representativeness across Estonian and Russian speakers. The reported response rates are in line with those generally observed in online panels (Pedersen and Nielsen 2016), both in comparison with face-to-face surveys (Szolnoki and Hoffmann 2013) and with studies relying on paid crowdsourcing respondents (Eklund et al. 2019). The average time to complete the web survey was 15 min.

Table 1 Sample structure of the survey (%)

3.2 Survey methodology

We assume in this article that the values formulated in the guiding principles for the design and use of AI (for an overview, see, e.g., Hagendorff 2020) are not only publicly negotiated and individually internalised but are also operationalisable and measurable in empirical research among citizens, who are the potential end-users of these AI solutions.

Based on a prior study systematising earlier guidelines and principles on the moral values related to AI (Hagendorff 2020) and on prior studies (Masso and Kasapoglu 2020; Hodapp and Hanelt 2022; Trauttmansdorff 2022; Dencik and Kaun 2020; Leese 2020; Ranerup and Henriksen 2019; Taylor and Purtova 2019; Eubanks 2018; Lowrie 2017; Taylor 2017; O’Neil 2016; Wilmott 2016; Rose et al. 2015; Hellberg and Grönlung 2013; Misuraca et al. 2012), we extracted 15 value items. The values included in this study are the following: efficiency, privacy, diversity, justice, equality, accountability, transparency, security, welfare, sustainability, monitoring, solidarity, explainability, autonomy, and interoperability. Most of these values have been included in the AI guidelines and principles (e.g., efficiency, privacy, justice, autonomy, diversity, accountability, security, transparency, explainability, monitoring, and solidarity). To balance the list, we added items less often mentioned in the guidelines on the moral values related to AI (e.g., welfare, sustainability) but emphasised in prior academic research (e.g., Dencik and Kaun 2020; Thylstrup et al. 2022). We thus included values mentioned both more and less often in the AI guidelines, as this better reflects the diversity of values among citizens belonging to different social groups and countries. In addition, to make the value items more understandable for the general population, we complemented the value names with short explanations (e.g., efficiency: saving costs, time, and similar resources), as suggested in prior research on human values (Rokeach 2008). A more detailed explanation of the choice of value items, based on prior research, is presented in Table 2.

Table 2 Value items used in this study based on AI ethics guidelines and prior research*

The developed value items were presented to the respondents with the following question: Below we present a list of values and principles that some public and private institutions consider very important in the development and use of solutions related to data analysis, algorithms and artificial intelligence, while others consider them completely irrelevant. How important or unimportant are they to you? We asked the respondents to evaluate the importance of the presented values on the following 5-point scale: 1 = completely insignificant, 2 = rather insignificant, 3 = neither important nor insignificant, 4 = rather important, 5 = very important. The selection of the 5-point scale employed in this study draws upon the original Milton Rokeach (1973) values scale, which was subsequently refined by Shalom Schwartz (1992). Considering the successful application and validation of this scale in previous studies examining value shifts in Estonia amidst rapid digital transformations (Kalmus et al. 2020; Vihalemm et al. 2017), we employed this well-established scale in this study.

The surveys were conducted in the local languages, and the translation validity of the questions and scales was assured through two-way translation, multiple testing, and close reading of the study instruments by participants in the three countries. Prior to data collection, ethical approval was sought and granted by the Swedish Ethical Review Authority (Approval No. 03663-01), certifying that the study design and procedures align with ethical considerations.

3.3 Data analysis techniques

We used descriptive analysis techniques to reveal the general tendencies in value evaluations, including arithmetic means and standard deviations. Additionally, analysis of variance (ANOVA) was used to estimate the statistical differences in the responses across the three countries. To reveal the unknown latent structures in the value estimates, we used principal-component factor analysis, and we applied the varimax rotation technique to obtain more interpretable factors. The analysis aimed not only to explore the underlying dimensions of the AI values, which the applied factor analysis enables, but also to explain the variations in those values as seen by individuals and thereby contribute to the discussions on the conceptual framework of AI values. To do so, we combined an interpretational approach with statistical criteria when choosing the factors. Statistically, we chose the number of factors and the final factor structure based on the Cattell scree test (Cattell 1966) and the Kaiser eigenvalue criterion (Kaiser 1960). Interpretationally, we compared factor solutions with different numbers of factors in the initial analysis. In addition, we tested the factor solutions across diverse socio-demographic groups with varying numbers of factors to identify the most stable and explainable factor structure.
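To illustrate the extraction and rotation steps described above, the following is a minimal sketch in Python, assuming the survey responses are stored as a respondents × items array X with the 15 value items as columns. The variable names and the NumPy-based implementation are our illustrative assumptions, not the actual analysis code; standard statistical packages implement the same procedure.

```python
# Sketch: principal-component factor extraction with the Kaiser criterion
# and varimax rotation. X is a hypothetical (n_respondents, 15) array of
# 1-5 ratings; the column order follows ITEMS.
import numpy as np

ITEMS = ["efficiency", "privacy", "diversity", "justice", "equality",
         "accountability", "transparency", "security", "welfare",
         "sustainability", "monitoring", "solidarity", "explainability",
         "autonomy", "interoperability"]

def pca_loadings(X, n_factors=None):
    """Extract principal-component loadings from the item correlations."""
    R = np.corrcoef(X, rowvar=False)              # 15 x 15 correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)          # returned in ascending order
    order = np.argsort(eigvals)[::-1]             # re-sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if n_factors is None:
        n_factors = int(np.sum(eigvals > 1.0))    # Kaiser criterion
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return loadings, eigvals

def varimax(loadings, tol=1e-6, max_iter=100):
    """Standard varimax rotation to simplify the loading pattern."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(loadings.T @ (
            rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p))
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):       # converged
            break
        criterion = s.sum()
    return loadings @ rotation

# e.g.: loadings, eigvals = pca_loadings(X, n_factors=4)
#       rotated = varimax(loadings)               # 15 x 4 rotated loadings
```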

We evaluated the reliability of the factor solution and calculated Cronbach's alpha coefficients across the variables constituting each factor. We calculated individual factor scores to compare the identified value dimensions and analyzed the mean scores across countries, socio-demographic groups, and background variables. We analyzed associations between the value estimations and the other items included in the questionnaire on the implementation of AI in diverse domains of life and socio-demographic variables. We explored the relationships between value dimensions and background variables using ANOVA. In addition, we conducted multi-dimensional scaling (MDS) analysis to test the differentiation of the value variables based on the correlations between them. MDS also enables finding the underlying latent dimensions that might explain how individuals see and value artificial artefacts. Combined with factor analysis, MDS further allows us to discuss and contribute to prior theoretical research on human values.
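The reliability and group-comparison steps can be sketched as follows, again under the assumption of a respondents × items array X, rotated loadings from the extraction above, and a per-respondent array of country labels; the helper names are hypothetical, and the factor scores shown are a simple weighted-sum approximation rather than regression-based scores.

```python
# Sketch: Cronbach's alpha per factor, simple factor scores, and a
# one-way ANOVA of factor scores across countries.
import numpy as np
from scipy.stats import f_oneway

def cronbach_alpha(item_block):
    """Cronbach's alpha for a (respondents, items) block of one factor."""
    item_block = np.asarray(item_block, dtype=float)
    k = item_block.shape[1]
    item_variances = item_block.var(axis=0, ddof=1).sum()
    total_variance = item_block.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

def factor_scores(X, rotated_loadings):
    """Weighted-sum factor scores from standardized items."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return Z @ rotated_loadings                   # (respondents, factors)

def anova_by_group(score, labels):
    """One-way ANOVA of one factor score across group labels."""
    groups = [score[labels == g] for g in np.unique(labels)]
    return f_oneway(*groups)                      # F statistic and p value

# e.g.: alpha_f2 = cronbach_alpha(X[:, f2_columns])  # f2_columns assumed
#       scores = factor_scores(X, rotated)
#       F, p = anova_by_group(scores[:, 0], country)
```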

4 Results

4.1 General tendencies in value estimates

In general, the respondents regarded all 15 value items presented to them as more or less significant (average of mean values > 4.16 on a scale where 4 is rather important and 5 is very important; see Table 3). This suggests that the value criteria formulated by AI experts in ethical guidelines and regulations are accepted by the citizens who are the targets of the design and implementation of AI systems.

Table 3 Descriptive statistics of the AI values (mean, standard deviation)

The analysis also revealed variations across the listed 15 value items. The respondents considered six values more significant: security, accountability, privacy, equality, justice, and transparency. These were also the values for which variation across individuals was lowest (see standard deviations < 0.1 in Table 3, indicated in italics). People tend to agree on the relative importance of these values when AI systems are developed. These values have been more often included in the AI ethics guidelines (Hagendorff 2020) and have gained more public attention.

Similarly, relatively high importance was assigned to values like explainability, welfare, autonomy, and sustainability, which were also characterized by relatively low heterogeneity in the responses. The lowest importance and the highest variation among respondents (standard deviation around 0.1) were ascribed to solidarity, diversity, interoperability, and efficiency. The relatively low value estimates but high standard deviations could be explained by the fact that these values are less often included in the AI ethics guidelines (Hagendorff 2020) and are less discussed and agreed upon in public, but also by a lack of understanding regarding these values. For the values of solidarity and diversity, the differences between the three countries are also the smallest, which could be interpreted as a sign of their embeddedness in the general understanding of human rights (Taylor 2017).

The lowest estimates and the most controversial responses were given for the value of monitoring, i.e., the effective control of human behaviour. The low importance of this value may indicate awareness and public discussion of the negative outcomes of using AI solutions to control human behaviour. The high standard deviations indicate that respondents may understand this value differently: it may be read as the control of human behaviour by the government and the exercise of power, but also as monitoring the decision-makers who use AI as a technological aid for more evidence-based decisions, sometimes with potentially biased results.

4.2 Value dimensions of AI

We used principal component factor analysis to examine the complex interrelationships between the values. The analysis revealed that the four-factor solution was statistically the most robust and best explained the underlying AI value orientations. The analysis is visualized in Fig. 1, where the loading strength is colour coded: factor 1 in blue, factor 2 in red, factor 3 in green, and factor 4 in purple. The strength of correlations across the value items is illustrated through a sliding colour scale.

Fig. 1 Correlation analysis of the AI values (Spearman, left) and factor loadings across variables (right). F1: protection of personal interests; F2: universal solidarity; F3: diversity and sustainability; F4: efficiency

Based on the core and residual variables constituting it, we named the first factor protection of personal interests, referring to the protection of personal interests for assuring social benefit when designing and implementing AI. This factor included the single value items related to protecting personal rights in AI development: privacy, accountability, security, explainability, and autonomy. Besides these core variables, variables related to human rights and social well-being, such as justice, equality, and welfare, also had high factor loadings (> 0.4) and constitute the additional components of this factor. Although this value dimension foremost characterizes the desire to protect humans, this protection is associated with the need to assure human autonomy, sovereignty, and individual privacy, as well as broader social categories and the individual’s relationship to society. Because this factor contains the greatest number of single items (n = 6), it also explains the greatest share (28 per cent) of the total variability.

Based on the variables included in it, the second factor was named universal solidarity, referring to an orientation towards general monitoring for assuring universal solidarity through the design and implementation of AI systems. This factor consists of four variables related to broader social aspects of AI design, development, and implementation: monitoring, interoperability, solidarity, and welfare. Because this factor includes only these four core items, and no additional variables have high loadings in more than one factor, its structure is more stable and robust than that of the first and third factors. The factor captures, on the one hand, the need for monitoring and effective control of human behaviour and, on the other, the need to ensure the adaptability, transferability, and universal usability of AI systems and to involve people and ensure social cohesion. The second factor explains 17% of the total variability.

The third factor revealed in the analysis was named social diversity and sustainability, referring to a value orientation towards AI design and development that considers social diversity and ensures social sustainability. This factor consists of core variables characterizing diverse aspects of social diversity, justice, equality, and sustainability. Additional variables with high loadings in this factor include privacy, welfare, and solidarity. Because the factor loadings of these additional variables were considerably lower than in the case of the first factor, they were not considered when naming this factor. Unlike the first factor, which emphasizes aspects of social benefit through personal data protection, this factor characterizes more universal values regarding ethnicity, gender, lifestyle, social group belonging, environmental protection, and waste reduction (e.g., including environmental impacts related to data storage). The third factor explains 15% of the total variability.

The fourth factor extracted in the analysis was named efficiency, based on the single variable constituting it. Efficiency, in terms of costs, time, and similar resources, is thus perceived as a distinct dimension in the overall understanding of which norms and principles AI developments should rely on, based on respondents’ views in the three countries. It is noteworthy that this variable was unidimensional and did not have a high loading in any other factor. The factor explains 7% of the total variability.

In sum, the value dimensions that emerged in the present analysis were expressed by the citizens as a response to the normative values written in the AI guidelines and are reflected by the four main dimensions. These single value beliefs that emerged in broader dimensions or orientations might represent humans’ basic values regarding AI solutions. However, based on the prior literature on value estimates (Ryan et al. 2022; Inglehart 2018), we also assume that the basic value orientations are dynamic rather than static phenomena and thus may vary in time and space and across socio-demographic groups. In the next section, to validate these value dimensions, we evaluate these emergent factors with additional analyses of associations.

4.3 Explanation of values and value dimensions

To explain the values and identified factor structure, we tested the associations of the value dimensions across countries and with socio-demographic and other background variables.

This study has revealed a list of values that is robust across cultures and, as such, can help explain potential conflicts in values when designing and implementing AI systems (see the differences in mean values across countries in Table 3). The analysis revealed that the average estimates for all the values were generally the highest and most homogeneous in Estonia (mean = 4.17, standard deviation = 0.83), followed by Sweden (mean = 3.97, standard deviation = 0.92) and Germany (mean = 3.73, standard deviation = 3.97). The highest valuations among the Estonian population may be explained by the important role of digital technologies, including AI, in national branding (Masso and Männiste 2020), which means they are more often publicly promoted and accepted by individuals. In contrast, the lowest estimates in Germany could be explained by the historically long discussions in the public sphere about the role of personal data in public administration (Schmidt and Weichert 2012), which has potentially also been expressed in cautiousness about implementing AI systems.

However, the analyses revealed that the national differences were not visible for all values. The variations across the three countries were biggest for the values of security, i.e., reducing people's insecurity and ensuring a sense of security, and accountability, i.e., responsibility for the use of data and its possible consequences. For both values, respondents from Estonia rated their importance highest, followed by Sweden and then Germany.

The analysis also showed significant differences in the four-factor structure across countries. The third factor, reflecting the values of social diversity and sustainability, was rather universally spread across the three countries, based on the non-significant F coefficient (p > 0.001, Table 4). The biggest differences in value estimates across the three countries were seen for the first factor, on the protection of personal interests for assuring social benefit, where Estonia stood out with high positive values compared to the lower positive values of Sweden and the negative values of Germany. Similarly, the value dimensions of the second and fourth factors, characterizing monitoring for ensuring universal solidarity and efficiency, had high factor scores in Estonia compared to Sweden and Germany, which had negative or close-to-zero values.

Table 4 Explanation of AI value dimensions across background variables (ANOVA)

Besides differences across countries, the analysis of associations with socio-demographic background also indicated statistically significant differences. The second factor, on general monitoring for assuring universal solidarity, was more universally spread (i.e., the F test was statistically insignificant) across socio-demographic variables like gender, age, and education. The other values were statistically significantly distinguished by socio-demographic variables: the first (protection of personal interests for ensuring social benefit) and fourth (efficiency) factors were rated more highly among those with higher education (for the first factor, higher values were also found among the older generation, 59–75 years old). The third factor (considering social diversity and sustainability) was rated more highly among female respondents.

We also analyzed the associations of the factor variables with agreement with automation in different domains. Factor three, valuing social diversity and ensuring social sustainability, presumably reflects more universal values independent of automation in particular fields, as indicated by the non-significant associations. Those individuals expressing higher evaluations of monitoring and ensuring social solidarity (factor two) and efficiency (factor four) agreed less with automation in the labour, social care, and policing domains. Similar associations were visible for factor one, but in the opposite direction: respondents valuing the protection of personal interests and social benefits expressed higher agreement with automation in social care and policing (the association with automation in labour was statistically non-significant).

4.4 Validation of value dimensions

We conducted several statistical tests to estimate the factor solution's validity and reliability. Following the Kaiser statistical criterion, we included factors with eigenvalues > 1; the four-factor solution met this criterion for the first and second factors and came close to it for the third (0.942) and fourth (0.905). The results also indicated high statistical reliability and internal consistency of the four-factor solution, as the factors had Cronbach's alpha coefficients > 0.7 (F1 α = 0.873, F2 α = 0.734, F3 α = 0.805; no alpha is reported for the fourth factor, which consists of a single item).

In addition, to evaluate the validity of the identified factor solution, we evaluated the potential shifts and stability of the value items across factors by comparing the change in factor composition across solutions with different numbers of factors (Fig. 2). This analysis is visualized in Fig. 2, which shows the composition of the factors and the factor loadings in principal component analyses with two to five factors. The streams indicate the robustness of values across the identified factors. The analysis revealed that the four-factor solution explained 67 per cent of the total variance, compared to the two- and three-factor solutions, where the explained variance was lower or close to 60 per cent.

Fig. 2 Principal component analysis of the changes in AI value estimates (composition of factors in analyses with two to five factors, and factor loadings)

The analysis also showed that the four-factor solution was the most suitable content-wise for characterizing the potential value agreements and conflicts across countries (Fig. 2). However, factor analysis within single countries indicated some differences in the factor structures suggested by the Kaiser criterion (two factors in Germany, three in Estonia, and four in Sweden). In the case of the two-factor solution in Germany and the three-factor solution in Estonia, the statistically suggested factors (based on the Kaiser criterion of eigenvalue > 1) explained less than 60% of the total variance.
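The share of total variance explained by a k-factor principal-component solution is simply the sum of the k largest eigenvalues of the item correlation matrix divided by the number of items, so the comparison of candidate solutions can be sketched as follows (again assuming a hypothetical respondents × items array X; the reconstruction is ours, not the authors' code).

```python
# Sketch: explained variance for candidate factor numbers (2 to 5).
import numpy as np

def explained_variance_by_k(X, candidate_ks=(2, 3, 4, 5)):
    R = np.corrcoef(X, rowvar=False)               # item correlation matrix
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1] # descending eigenvalues
    n_items = R.shape[0]
    return {k: eigvals[:k].sum() / n_items for k in candidate_ks}

# Per the text, k = 4 reached 67 per cent of total variance, while the
# two- and three-factor solutions stayed at or below roughly 60 per cent.
```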

In addition, we conducted multi-dimensional scaling (MDS) analysis to test the correlations between the single variables and the emerging latent dimensions; the results are visualized in Fig. 3, which places the value items across a four-field matrix. The analysis confirmed the distinctiveness of the values of efficiency and diversity, which had the highest loadings in the factor analysis (highest in the five-factor solution and lowest in the two-factor solution) and stood out in the MDS space (Fig. 3). The MDS analysis revealed two underlying dimensions based on how the values converged and diverged in the multidimensional space. The first dimension (the x-axis) captured, at one pole, the value of maintaining the status quo by protecting the interests of individuals, institutions, the state, or society (e.g., privacy, security, autonomy, and welfare); the opposite pole emphasized an orientation toward universal change and movement (e.g., sustainability, interoperability, and monitoring). Based on this, we named this dimension orientation to change vs. maintenance. The second dimension (the y-axis) reflected an orientation to opportunities through available resources (e.g., efficiency and diversity) or competition over those resources (e.g., security, sustainability, and autonomy). We therefore named this dimension orientation to resources, whether in the form of opportunities for resources or competition over them.

Fig. 3 AI value dimensions (multidimensional scaling analysis)
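The MDS step can be sketched as follows, embedding the 15 items in two dimensions from their pairwise Spearman correlations; the 1 − r dissimilarity transform is a common choice that we assume here, since the article does not state the exact transform used.

```python
# Sketch: two-dimensional MDS of the value items from their correlations.
import numpy as np
from scipy.stats import spearmanr
from sklearn.manifold import MDS

def mds_coordinates(X, random_state=0):
    rho, _ = spearmanr(X)                     # 15 x 15 Spearman correlations
    dissimilarity = 1.0 - rho                 # high correlation -> low distance
    np.fill_diagonal(dissimilarity, 0.0)
    mds = MDS(n_components=2, dissimilarity="precomputed",
              random_state=random_state)
    return mds.fit_transform(dissimilarity)   # one (x, y) point per item

# Plotting each item's coordinates and labelling them with the item names
# reproduces the kind of four-field map shown in Fig. 3.
```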

5 Discussion

In previous studies, different human value models have been proposed (Hanel et al. 2018), with the need for new models arising from social changes. One of the global changes societies face is the increased use of AI solutions in various fields. This study attempted to contribute to developing a framework of values in AI by proposing items measurable among the general population and across countries.

This study confirmed our prior assumption, inspired by prior research on human values (Hanel et al. 2018; Cieciuch et al. 2014; Davidov et al. 2014; Rokeach 2008; Inglehart et al. 1998; Schwartz 1994), that people express their understandings, agreement, and disagreement with regard to publicly formulated value principles on AI. However, further studies are needed to test the hypothesis that these value items constitute basic values: a motivational force underlying how people behave daily in various contexts (e.g., concerning social media algorithms) and as citizens (e.g., whether they are ready to share data about themselves for developing AI). Nor do we know yet whether these values shape everyday communication with artificial artefacts (e.g., new behaviour patterns and interaction with machines). However, this research revealed that the evaluation of values and the underlying value dimensions were significantly associated with the implementation of automation in specific domains (e.g., valuing social solidarity is related to lower agreement with automation in policing). Therefore, we assume these values might be intrinsically rooted and prevalent in human minds, motivating people to develop, design, and use AI solutions.

We also agree with prior research that the values on AI might accord but also conflict with the ideals of public managers (Bannister and Connolly 2014; Rose et al. 2015) or the public values reflected by governments (van Noordt and Misuraca 2022). For example, whereas efficiency drives public value in AI use cases in European governments (van Noordt et al. 2024), citizens value it only modestly. Conversely, social engagement values were relatively marginal in governmental AI use cases, although citizens valued social engagement values like justice highly. We also agree with prior research (Bannister and Connolly 2014) that citizens’ value dimensions on AI can be embedded in the general normative value ideals that public administrations have held in developing general information and communication technology solutions in the context of broader digital transformations. For example, the four value dimensions of efficiency, social diversity, general monitoring, and protection of personal interests found in this study are comparable to the public ICT values of efficiency, engagement, service, and professionalism in prior studies (Bannister and Connolly 2014). However, further studies are needed to test if and how the human values on AI that emerged here accord with, relate to, and statistically correlate with the value dimensions found in research on public values, or how these values are embedded in digital transformation ideals (Masso and Männiste 2020; Bannister and Connolly 2014).

Because the development of AI systems is highly context-specific, this study on the variations of AI values across three countries indicates the difficulties in transferring AI solutions to different social contexts (Masso et al. 2022). Therefore, this study also suggests a need for implementing ethical criteria in different domains and countries and different ways of evaluating AI applications depending on the specific domain and national context. The spatial and temporal comparison of AI values might be effectively used for estimating the social shifts in societies or specific domains. The latter might be a desirable aim, especially when institutions and platforms that collect the data to develop AI solutions are operating globally. Still, the values on how to design, use, and implement these solutions rely on socio-culturally embedded understandings. This is an opportunity but also a difficulty when developing a framework of AI values that is empirically measurable among the general population since the understandings and meanings related to AI vary across countries.

Thus, this study does not assume that the values embedded in AI are static and less dependent on the context in which they will be used or the nature of the technology. AI can be used for many tasks and purposes, some less impactful than others. In that respect, future research may use scenarios with different concrete applications to see if and how the values change. Furthermore, as a future direction of design science, further studies can specify how different values are assigned to AI applications and how these values change over time.

6 Conclusion

This article strived to contribute to the discussions on the design and implementation of AI systems and the lack of coherence and common understanding regarding the values these systems entail. We empirically evaluated and compared AI value items among citizens based on representative surveys conducted in Estonia, Germany, and Sweden.

This study postulated the existence of 15 AI value items: efficiency, privacy, diversity, justice, equality, accountability, transparency, security, welfare, sustainability, monitoring, solidarity, explainability, autonomy, and interoperability. We assessed these items in a comparative cross-country survey. We measured how the citizens of three countries consider the values related to artificial artefacts as developed and expressed publicly and as made visible in numerous AI evaluation criteria and guidelines (Hagendorff 2020). Four value types were found in this analysis: (1) protection of personal interests to ensure social benefit, (2) general monitoring to ensure universal solidarity, (3) social diversity and assuring social sustainability, and (4) efficiency. We also found that these values and value types can be ordered along two dimensions: orientation to change vs maintenance and orientation to opportunities vs competition over resources.

This study has revealed that these AI value dimensions vary across countries, domains, and individuals’ socio-demographic backgrounds. This study suggests that there might be values that are more universally understood and expressed across individuals, countries, and domains, such as the value of social diversity and ensuring sustainability. However, other values might be more specific to domains, countries, or individuals. Furthermore, there is a significant difference in the evaluation of values embedded in using AI in different domains, like solving crimes, versus using AI in public administration to distribute social benefits or health insurance.

This study is not free of limitations. The online study enabled the evaluation of the expression of values related to designing, using, and implementing AI technologies among those population groups potentially more prone to use, and more aware of, automated decision-making. Online panels are a valuable source for collecting data on AI because developers often use the people participating in such platforms to test AI solutions and to collect valuable data for developing them (40 per cent; see, e.g., Ibrahimi et al. 2023). However, the overrepresentation of respondents with higher education and the underrepresentation of older generations in the panel used in this study means we can only generalise to digitally active population groups. Therefore, additional comparative studies are needed to test, evaluate, and validate the value dimensions of AI found in this study in larger samples. In addition, a comparison of the identified value dimensions with human values is needed to better estimate the external validity of the AI value dimensions found here. Additional cognitive interviews would make it possible to better understand people’s connections with these values and to improve the developed list and wording of value items.