Education for AI, not AI for Education: The Role of Education and Ethics in National AI Policy Strategies

Article published in the International Journal of Artificial Intelligence in Education

Abstract

As of 2021, more than 30 countries have released national artificial intelligence (AI) policy strategies. These documents articulate plans and expectations regarding how AI will impact policy sectors, including education, and typically discuss the social and ethical implications of AI. This article engages in thematic analysis of 24 such national AI policy strategies, reviewing the role of education in global AI policy discourse. It finds that the use of AI in education (AIED) is largely absent from policy conversations, while the instrumental value of education in supporting an AI-ready workforce and training more AI experts is overwhelmingly prioritized. Further, the ethical implications of AIED receive scant attention despite the prominence of AI ethics discussion generally in these documents. This suggests that AIED and its broader policy and ethical implications—good or bad—have failed to reach mainstream awareness and the agendas of key decision-makers, a concern given that effective policy and careful consideration of ethics are inextricably linked, as this article argues. In light of these findings, the article applies a framework of five AI ethics principles to consider ways in which policymakers can better incorporate AIED’s implications. Finally, the article offers recommendations for AIED scholars on strategies for engagement with the policymaking process, and for performing ethics and policy-oriented AIED research to that end, in order to shape policy deliberations on behalf of the public good.


Fig. 1


Availability of data and material

Not applicable

Code availability

Not applicable

Notes

  1. Australia, Canada, Singapore, Denmark, Taiwan, France, the EU, the UK, and South Korea have committed nearly 8 billion, with the US contributing or planning to contribute at least 4 billion and China at least 14 billion. These are moving targets and low-end estimates, especially as private investment constitutes an even greater sum.

  2. I do not include documents produced by intergovernmental bodies, such as the United Nations or European Union. While these documents are similar in nature, they are less tied to direct national institutions and strategies, and are therefore less analogous to the other documents. For further details and rationales regarding data collection and inclusion/exclusion criteria, please see Appendix 1.

  3. Further details about codebook development and iteration are available in Appendix 1.

  4. For example, Malta’s (2019) document initially notes that AI for healthcare may be amongst the highest impact projects and “most pressing domestic challenges” worthy of prioritization, but it does not proceed to include any substantive discussion or a subsection on healthcare. In comparison to the document’s discussion of other topics, and in comparison to other countries’ AI policy documents that discuss healthcare in more depth, this relatively narrow treatment of the topic led to coding it as yellow. Similarly, Russia’s (2019) brief mention of using AI to “[improve] the quality of education services” does not provide enough detail to be clear about the role of AIED as a potential tool for teaching and learning, and so is considered too ambiguous to code as either green or red.

  5. Any errors are the sole fault of the author.

  6. I nevertheless captured these mentions in the research memo and discuss them at the end of this section.

  7. Note that other policy sectors, like healthcare, also have dedicated agencies and other policy documents, but nevertheless receive more attention than education in AI policy strategies.

  8. See The World Bank Country and Lending Groups classification for classifications by national income.

  9. There are additional explanations to consider as well, though this study does not provide clear evidence to establish them. First, policymakers may simply never have been informed about AIED. AIED has typically been contained within an expert academic domain primarily accessible to computer scientists (Schiff 2021). Relatedly, while some AIED applications are beginning to reach mainstream classrooms, relatively few people, adults or children, have personal experience with them. In contrast, general education and its role in the labor market is something that nearly all members of society experience and can relate to, and a traditional focus of policymakers. Without basic awareness, policymakers may not realize the transformative potential of AIED. If this is accurate, a clear prescription is to significantly increase efforts to inform policymakers about AIED and its ethical implications. This raises the question of whether the AIED community is prepared to do so, something which this article addresses in its Recommendations for AIED Researchers section.

  10. The discussions of Spain, Mexico, Kenya, and India demonstrate how such a link between ethics and policy for AIED might be established even though most countries have not yet identified these connections explicitly.

  11. Note that a similar approach has also been adopted by The Institute for Ethical AI in Education (2020), which, through a series of workshops and reports, has explored AIED policy by using a different AI ethics framework, the EU’s seven Ethics Guidelines for Trustworthy AI (European Commission 2019). This provides further support to the idea of approaching AI governance through an ethical lens.

  12. See Schiff (2021), The Institute for Ethical AI in Education (2020), Holmes et al. (2021), and other articles in this issue for more detailed reviews of AIED ethics.


Funding

The author declares no funding.

Author information

Corresponding author

Correspondence to Daniel Schiff.

Ethics declarations

Conflicts of interest

The author declares no conflicts of interest or competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1: Methodology

This appendix provides more extensive details surrounding the data collection, screening, coding, and analysis strategy, as well as associated limitations, than is available in the main body of the article.

Data Collection

The data collection process differs from that typically employed in meta-analysis and review papers, such as that recommended by the PRISMA Statement (Moher et al. 2009), largely because the documents evaluated here are so-called “gray literature” not found in traditional databases (Mahood et al. 2014; Rothstein and Hopewell 2009). Given the type of documents and substantive focus of the study, the data collection process relied on linkhubs, Google searches, and manual searching of certain documents and countries: e.g., “country name + AI policy strategy.” The search process is connected to ongoing research assessing AI ethics and policy (Schiff et al. 2021, 2020a, 2020b). The first author and colleagues have maintained a database of AI policy documents and additional AI ethics documents (available at https://dx.doi.org/10.21227/fcdb-pa48). The database was created in Fall 2018 and updated regularly until early 2020. Data collection for policy documents benefited especially from lists maintained by the Future of Life Institute (2020) and Tim Dutton (2018). These linkhubs contain updates about AI policy developments in dozens of countries, including how far along countries are in developing task forces, funding proposals, and formal AI policy strategy documents.

For each country noted in these linkhubs, I searched for and accessed documents and initiatives mentioned, and performed additional Google searches and manual searches for each country to ensure that the set of national AI policy strategies was as complete as possible. While it is possible that some countries were omitted, perhaps due to lack of language familiarity, the two key sources are invested in tracking national AI policy developments. All such candidate documents were thus captured in the database managed by the first author and colleagues. From this larger database, I extracted only documents produced by public sector organizations (e.g., countries, not corporations or non-governmental organizations). This resulted in 76 public sector AI documents that formed the candidate pool of national AI policy documents.

Screening Process

The screening process involved identifying criteria for types of documents, publication language, the population and phenomena of interest, and time period (Stern et al. 2014). The purpose of this study was to assess national AI policy strategies as they relate to education. Therefore, the study applied the following inclusion/exclusion criteria:

  • Documents needed to be complete and resemble a national AI policy strategy. Some such documents describe themselves as preliminary reports or blueprints, working towards more robust or formalized policy strategies. Nevertheless, many were sufficiently robust to be considered policy strategies. On the other hand, countries that had only announced task forces, funding initiatives, or created websites, or otherwise did not have a well-developed document analogous to those of other countries, were not included. This ensures that countries can be compared fairly on the basis of sufficiently detailed policy documents; the documents in the final sample average a sizable 62 pages each.

  • Documents needed to be in English, due to the author’s limited language proficiency. However, in a number of cases, governments had produced official English-language translations (e.g., Finland, Italy). While automated translation of non-English documents (e.g., Google Translate) may not be of sufficient quality, there was one unofficial but high-quality translation included in the final sample, of China’s AI policy strategy, performed by the Foundation for Law and International Affairs.

  • The study also excluded documents produced by inter-governmental organizations, such as the United Nations, the Organization for Economic Cooperation and Development, and the European Union. While these documents are no doubt important, they address a different scope, as they are relatively distant from national-level institutions, funding activities, and other policy activities, such as those involving education policy. This makes these documents less comparable to national-level AI policy strategies.

  • Finally, in a number of cases, countries produced multiple documents that were potentially relevant to AI policy. Only one document was selected per country. The chosen document was typically the most robust and the most recent, at times an evolution of a previous draft or more preliminary document. Further, some candidate documents were not representative of an overarching national AI policy strategy. For example, documents from Germany addressing autonomous vehicle policy and from Finland addressing work in the age of AI were excluded in favor of Germany’s National Strategy for AI and Finland’s Age of AI. These screening criteria helped to ensure that individual countries were not overrepresented, that the information analyzed was not redundant, and that the most robust, high-quality, and comparable policy strategies were selected in each case.

Of the 76 candidate documents, eight did not resemble a complete national policy strategy document, one was not available in English, 13 were inter-governmental, and 30 were excluded in favor of more representative documents. Screening resulted in a final sample of 24 national AI policy strategies.
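As a quick consistency check, the screening funnel above can be expressed as simple arithmetic. The sketch below is illustrative only; the exclusion labels paraphrase the criteria described in the text:

```python
# Screening funnel: 76 candidate public-sector AI documents
# narrowed to the final sample of national AI policy strategies.
candidates = 76
exclusions = {
    "not a complete national policy strategy": 8,
    "not available in English": 1,
    "inter-governmental (UN, OECD, EU)": 13,
    "superseded by a more representative document": 30,
}

final_sample = candidates - sum(exclusions.values())
print(final_sample)  # 24
```

The counts sum exactly: 8 + 1 + 13 + 30 = 52 exclusions, leaving the 24 documents analyzed in the study.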

Codebook Development

After identifying the final sample, the analytical strategy began with the development of a preliminary set of topics, in the form of a codebook (Miles et al. 2014; Thomas 2006). These topics in the codebook were chosen based on the study’s conceptual scope and framework, the author’s subject matter knowledge, and previous exposure to AI policy strategies. The scope of interest was any discussion of education, construed as broadly as possible, such as youth and adult education, training and re-skilling, investing in educational systems, public education and awareness, the need to develop more AI talent (e.g., computer scientists, engineers), and social and ethical issues related to AI and education.

The initial codebook included 11 categories: Education as Priority Topic, K-12 Education, Post-Secondary Education, Adult Education and Training, General AI Literacy, Training AI Experts, Preparing Workforce, Intelligent Tutoring Systems, Pedagogical Agents and Learning Robots, Predictive Educational Tools, and AI for Healthcare. A best practice in qualitative research is to iterate and refine the codebook through testing on a small subset of the data (Roberts et al. 2019). Therefore, I randomly selected five documents—aiming for a meaningfully-sized and somewhat representative subset—and applied the thematic schema to them. This involved reading the documents to determine whether the coding schema could validly and straightforwardly reflect the way education was discussed in the documents, and to identify if the coding schema captured the full range of issues in the documents relevant to the article’s conceptual scope.

Based on this initial test, several categories were modified, removed, and collapsed as follows:

  • Education as Priority Topic and AI for Healthcare were retained, as they were easy to apply. Either topic might be explicitly noted as a priority topic in a document, for example, if a list of priority policy sectors was mentioned and education was among that list. Alternatively, education/healthcare were coded as priority topics if a significant subsection was dedicated to them, or if there was a similar amount of discussion relative to the length of the document as compared to other documents that did identify education (or healthcare) as an explicit priority.

  • K-12 Education, Post-Secondary Education, and Adult Education and Training were removed. These categories were originally designed to separate discussion of education by target age/population group. However, the test documents often did not identify the target age/group when discussing AI and education, making this distinction difficult to code accurately. Moreover, these population differences were deemed less relevant for the overall purpose of the article. For example, that documents emphasized the need to develop more AI researchers seemed more pressing to the document authors than whether this development happened in secondary or postsecondary educational institutions.

  • Training AI Experts and Preparing Workforce for AI were straightforward and were retained.

  • General AI Literacy was renamed to Public AI Literacy. The former was originally defined to emphasize development of general digital, STEM, and other skills in educational settings. The theme was relabeled and redefined to incorporate AI literacy in both educational (classroom) and ‘public’ settings, because both settings were discussed and justified to pertain to similar policy purposes.

  • The revised codebook collapsed Intelligent Tutoring Systems and Pedagogical Agents and Learning Robots into Teaching and Learning. Too few documents addressed these issues at the level of detail of individual AIED technologies or tools to allow for reliable identification, as the documents generally employed more abstracted terms and discussions.

  • The revised codebook also abstracted Predictive Educational Tools into Administrative Tools, as there were several examples of AIED tools mentioned that were better captured by the latter, broader terminology, such as the use of AI for inspection or assigning teachers to schools.

These adjustments resulted in a revised codebook with seven categories (a reasonable number for inductive studies) (Thomas 2006), described in the main body. The final coding categories were straightforward to apply to the data and captured relevant concepts within the study’s scope well.

An important note is that, despite an initial attempt to code for discussion of AIED ethics specifically, given its importance to this study, discussion of these topics was too rare to justify a dedicated theme. Most discussion addressing ethics and education focused on Education for AI purposes, such as training future machine learning experts to develop ethical design skills, rather than addressing ethical implications emanating from AIED. Nevertheless, I captured all mentions of ethics in the context of both Education for AI and AI for Education in my memos, and considered the presence and absence of these topics as part of the interpretive work.

Coding Approach

Next, I applied the codebook to the 24 documents in the sample (approximately 1491 pages total). Each document was read closely and assessed manually along the seven topics using a simple form of content analysis. This consisted of evaluating each document for the presence or absence of each theme (White and Marsh 2006), largely a binary exercise, though some documents were coded as borderline cases. In Table 1, a country is marked as green when a theme was reflected, red when absent, and yellow when the case was sufficiently ambiguous or borderline.
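The presence/absence coding scheme can be sketched as a small data structure. This is a hypothetical illustration, not the study's actual instrument: the theme names follow the revised codebook, and the two entries shown reflect the Malta and Russia yellow codings discussed in the text, while a full coding table would hold one code per theme for all 24 documents:

```python
from enum import Enum

class Code(Enum):
    GREEN = "theme present"
    RED = "theme absent"
    YELLOW = "ambiguous or borderline"

THEMES = [
    "Education as Priority Topic",
    "Teaching and Learning",
    "Administrative Tools",
    "Training AI Experts",
    "Preparing Workforce",
    "Public AI Literacy",
    "AI for Healthcare",
]

# Partial illustration using the two borderline cases described in the text.
codings = {
    "Malta (2019)": {"AI for Healthcare": Code.YELLOW},
    "Russia (2019)": {"Teaching and Learning": Code.YELLOW},
}

def tally(theme):
    """Count how many documents received each code for a given theme."""
    counts = {c: 0 for c in Code}
    for doc in codings.values():
        if theme in doc:
            counts[doc[theme]] += 1
    return counts

print(tally("AI for Healthcare")[Code.YELLOW])  # 1
```

A three-valued code like this keeps the exercise largely binary while still recording borderline cases explicitly, which is what Table 1's green/red/yellow marking does.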

For example, Malta’s (2019) document initially notes that AI for healthcare may be amongst the highest impact projects and “most pressing domestic challenges” worthy of prioritization, but it does not proceed to include any substantive discussion or a subsection on healthcare. In comparison to the document’s discussion of other topics, and in comparison to other countries’ AI policy documents that discuss healthcare in more depth, this relatively narrow treatment of the topic led to coding it as yellow. Similarly, Russia’s (2019) discussion of using AI to “[improve] the quality of education services” does not provide enough detail to be clear about the role of AIED as a potential tool for teaching and learning, and so was considered too ambiguous to code as either green or red.

Analysis Approach

Relevant quotes from the documents were captured in a research memo and organized under the seven categories (Thomas 2006) to support higher-order conceptual and thematic interpretation. Additional quotes of interest and minor categories were included here as well, such as any mentions of ethics related to education. From this, I synthesized insights from the frequency and character of these topics, applying a thematic analytic approach (Castleberry and Nolen 2018) to identify major findings. This interpretive exercise involves considering second-order meanings or explanations for the patterns identified in the data (Miles et al. 2014), including the finding that AIED’s ethical implications are neglected. I present results for each topic in the main article, along with interpretation of key findings within and across topics, to support a broader discussion of the role of education and ethics in AI policy in the subsequent sections.

Limitations

Because the documents were coded by a single researcher, it is not possible to, for example, assess inter-rater reliability. Further, the conceptualization of the study, codebook development, and interpretation were not subject to the perspectives of other researchers or experts outside of the peer review process. However, quantitative measures of reliability are only sometimes considered essential in qualitative research (Castleberry and Nolen 2018), and a single-coder approach can be appropriate and, in some cases, even preferable (Harding and Whitehead 2013). Multiple researchers may not be necessary to provide sufficient consistency and credibility, as a single researcher can provide a unitary and consistent perspective, albeit one dependent on that author’s subjective assessments. For example, research using semi-structured interviews with dozens of coding categories and many degrees of detail (e.g., scoring attributes from 1–10) benefits especially from assessment of inter-rater reliability, particularly if the codes are challenging to conceptually separate or define. In this study, however, the number of topics is small, the level of detail simple, and the concepts fairly easy to conceptually separate.

Moreover, in qualitative research, there are common criteria of research rigor used as alternatives to traditional quantitative criteria of validity and reliability. For example, one widely used set of criteria comes from Lincoln and Guba (1985), who propose credibility as an alternative to internal validity, transferability as an alternative to external validity, dependability as an alternative to reliability, and confirmability as an alternative to objectivity. To satisfy these criteria, the analysis employed several recommended strategies (Lincoln and Guba 1986). Within-method triangulation across multiple documents (Jonsen and Jehn 2009) and the use of direct quotes as descriptive evidence provide rich support for the claims made, supporting their credibility, dependability, and transferability. Further, because the data are publicly available, as opposed to privately held interview data, for example, they are open to scrutiny and confirmation or disconfirmation. However, in part because of researchers’ individual positions and biases (Castleberry and Nolen 2018), it is possible that other researchers would identify different coding categories or different salient themes. As such, the single-researcher approach remains a limitation of this study. Future research examining the role of education in AI policy would be welcome to assess the extent to which the findings presented here are indeed credible, dependable, confirmable, and transferable.


Cite this article

Schiff, D. Education for AI, not AI for Education: The Role of Education and Ethics in National AI Policy Strategies. Int J Artif Intell Educ 32, 527–563 (2022). https://doi.org/10.1007/s40593-021-00270-2
