The recent introduction of generative artificial intelligence (GenAI), such as ChatGPT, in the public domain has greatly accelerated the reality of a digital future dominated by intelligent machines and chatbots. GenAI is a special type of AI that focuses on creating new and original content, whereas AI in general encompasses a broader range of machines designed to simulate human intelligence and cognitive functions. With the ability to generate original text, image, sound, and computer code in an instant, GenAI has the potential to transform and disrupt the nature of how we learn, think, and work. Its advent marks a seismic shift in education and raises many serious questions and issues regarding teaching, learning, and assessment across all subject areas.

In science education, the unique epistemological nature of science presents further distinct questions and challenges compared to other disciplines. Natural science, as the study of nature, is deeply rooted in our interaction with materiality. Our connection with material things in the physical world fundamentally shapes the ontology of what we understand as nature and the epistemology of how we understand nature. On the other hand, because the information processing and outputs of GenAI are based on statistical probability and semantic relationships among words in large language models, they are devoid of any grounding in the material things of the physical world. Thus, in a future dominated by GenAI, how will its pervasive use distort our connection to the physical world and consequently reshape our views of science and science education? What must we emphasize and teach about the role of material things in science education in order to maintain our connection to, and sense of reality in, the physical world?

The aim of this paper is to draw attention to the role of materiality and its significance in our conversation about a future embedded with GenAI. Materiality is central to the nature of science, yet it has often been overlooked and sidelined because science education researchers tend to focus more on the role of thought and language in the formation of scientific knowledge (Hetherington et al., 2018; Scantlebury & Milne, 2019). As Scantlebury and Milne (2019) put it, while language is crucial to the construction of science and science education, it also “has too much power, and, with that power, there seems a concomitant loss of interest in exploring how matter contributes to both ontology and epistemology in science and science education” (p. 3). In the face of an increasing divide between what is real and artificial, there is more urgency now to anchor our research and discourse in the material roots of science. Understanding the “parameters of artificiality, or the ways in which computers are unlike human intelligence” (Cope et al., 2021, p. 1230), and recognizing the importance of materiality can help shape what is “real” about our collective knowledge and practices of science.

In this theoretical paper, we argue that materiality encompasses a “material reality” that is central and indispensable to human culture and knowledge. This perspective sheds light on how the creation of scientific knowledge is inseparable from our interaction with material objects, and it prompts us to consider what this process means now that we have GenAI. With GenAI expected to play an increasingly prominent role in knowledge creation, there is a particular need to understand what is accepted as a credible source of truth, or epistemic authority, in science and how that authority is associated with material reality. To support our argument, this paper begins with a discussion of the emerging paradigm of new materialism in social science inquiry, as well as research from science studies that informs our understanding of scientific practices. Drawing from these perspectives, we explain how materiality is central to the epistemic authority of science and argue for its greater recognition in a world of GenAI built on large language models. The paper then concludes with recommendations for research and teaching involving the use of GenAI in three specific areas: practical work, material inquiry and argumentation, and learning with GenAI.

1 Theoretical Framing

This paper is framed by two interdisciplinary perspectives that are emerging in education research and inquiry: new materialism and science studies.

1.1 New Materialism

To discuss new materialism, it is first essential to differentiate it from traditional (old) materialism. Materialism is an old philosophy dating back to ancient Greek atomism and, later, the Newtonian mechanistic worldview of forces and matter (Gamble et al., 2019). Materialism reduces reality solely to the material interactions of passive objects, with humans as external observers who have no influence on the physical world. It maintains a clear separation between material objects and human actions (including thought, language, knowledge, and consciousness). New materialism, however, marks a radical departure from this assumption.

New materialism is an interdisciplinary movement and inquiry that gained prominence in the humanities and social sciences around the millennium (Sencindiver, 2017). It involves a paradigm shift, or “material turn,” that challenges anthropocentric assumptions of human culture and deconstructs longstanding binaries between humans and non-humans, culture and nature, and meaning and matter. With its roots in philosophy, feminism, science and technology studies, and literary and cultural studies, new materialism sees matter as an active part of an interconnected web where agency is distributed among both human and non-human actors.

Central to new materialism is the idea that reality is shaped by the interactions among matter, and not by inherent properties that pre-exist within matter as a passive thing. This concept, termed “performative” materialism by Gamble et al. (2019), suggests that materiality emerges from the performative nature of matter’s relationships. Barad (2003) aptly sums up this post-humanist performative view: “matter does not refer to a fixed substance; rather, matter is substance in its intra-active becoming – not a thing, but a doing, a congealing of agency” (p. 822). She introduced the term “intra-action” to replace “interaction,” as the latter presumes the existence of pre-established and distinct entities that inter-act with each other. In her view, all entities are always entangled with each other in complex ways, and thus their existence cannot be predetermined outside of their intra-actions with one another. At the same time, the properties of any discernible thing do not precede its actions with other things, nor do they remain unchanged by them.

Barad’s (2007) concept of intra-action is pivotal as it challenges conventional notions of causality and agency. Causality has traditionally been perceived as a one-way process where one entity acts upon another. However, Barad (2007) argues that causality is a mutual and reciprocal process where entities are both the cause and effect of each other’s actions. In other words, entities co-create each other. Similarly, agency is not located within a single entity, be it human or non-human, but is instead produced and distributed in the intra-actions and relationships between entities. This “agential realism” worldview prompts a re-evaluation of our understanding of the world, emphasizing interconnectedness and challenging the perception of entities as isolated phenomena. It underscores that our knowledge is deeply intertwined with the intra-actions in the physical world, advocating for a relational, networked perspective.

1.2 Science Studies

Science studies is an interdisciplinary field that encompasses the history and philosophy of science (HPS) and science and technology studies (STS) (Ford & Forman, 2006). Focusing on scientists’ everyday practices, this field has gained prominence with the recent “practice turn,” shifting the focus from inquiry to practice in science education, as exemplified by the Next Generation Science Standards outlining eight key scientific practices in the curriculum (Erduran, 2015; Stroupe, 2015).

A key aspect of science studies is understanding the nature of science based on what scientists actually do in practice to generate knowledge claims, in contrast with philosophical and idealized views of science (Jiménez-Aleixandre & Crujeiras, 2017). Seminal research by Latour and Woolgar (1979) demonstrated how scientists transform material substances into various forms of inscriptions, like data charts and scientific papers. This process involves building a network of “allies,” including both human collaborators and non-human elements like instruments and data, which support the scientific claims. As these networks strengthen, the claims gain acceptance as scientific facts. The assemblage of allies involved is then black-boxed until someone questions it in light of new data.

Ford and Forman (2006) describe the nature of science as an interplay between the social and material aspects of scientific practice. The social aspect emphasizes the role of scientific communities in public debates, knowledge validation, and research processes (Mody, 2015). This perspective presents a more complex version of science, moving away from a formulaic “scientific method” toward more dynamic, socially driven research practices (e.g., Chin & Osborne, 2010; Kim & Song, 2006). Studies by researchers like Kim and Roth (2014) underscore the social nature of scientific argumentation, highlighting the importance of dialogic discourse and peer interaction.

The material aspect, on the other hand, emphasizes the role of material things and non-human actors in scientific knowledge creation. These material things include the objects of inquiry that scientists investigate (e.g., the SARS-CoV-2 virus, the Huc enzyme, the M87 black hole) as well as the tools, instruments, and machines that render those objects knowable and observable in the first place. Despite its importance, the material aspect often receives less attention than theory development. As Milne (2019) critiques, “With that theory focus, science education loses sight of what material things, like the humble thermometer, have contributed to the sociocultural milieu of science and to the learning of science as a form of doing, acting, and making” (p. 12). Thus, materiality plays a critical role in both empirical investigations and theoretical developments in science.

Furthermore, the material aspect of scientific practice also determines the epistemic authority of what counts as valid knowledge within science (Ford & Forman, 2006). While scientists frequently engage in debates and challenges to one another’s claims, the final arbiter is what can be observed in the physical world through an assemblage of non-human allies (Latour, 1987). Material objects themselves do not function as arbiters of scientific claims; rather, their epistemic authority arises from the interplay between the social and material aspects of scientific practice, which Pickering (1995) describes as the “mangle of practice” between objects and humans. Thus, epistemic authority in science is established through this back-and-forth dance of resistance and accommodation between material objects and scientists. While the community of scientists plays an important role in this process, material objects have an equally important role in grounding the authority of scientific knowledge.

Finally, science studies emphasizes the shared agency between humans and non-humans, aligning with post-humanist perspectives like Barad’s (2003) agential realism and Gamble et al.’s (2019) performative materialism. Both new materialism and science studies therefore present a common post-humanist narrative that positions materials as co-participants in human culture and knowledge. This perspective is particularly crucial for how we perceive the role of GenAI in science and science education. In summary, science studies offers a comprehensive view of scientific practice, encompassing both the social and material aspects. The field emphasizes the dynamic, collaborative, and material-dependent nature of scientific work, challenging traditional views of science as a rigid, human-centric discipline.

2 Materiality and Generative Artificial Intelligence

How humans make meanings with language and how GenAI produces human-like text through language are fundamentally different. Although the use of language is crucial in both human cognition (Vygotsky, 1986) and AI data processing (Jeon & Lee, 2023), the ways words and symbols are processed in the acquisition of meaning are vastly dissimilar. Human meaning-making is always grounded in and intertwined with materiality in a physical context (Barad, 2007; Gamble et al., 2019). According to new materialism, it is impossible to separate matter from the meaning we create about it. Human knowledge and culture are consequently connected to the physical world we inhabit. In contrast, the language processing and outputs generated by AI are not contextualized to a material reality. It is therefore important to understand these differences to appreciate the unique roles that humans and GenAI play in the construction of scientific knowledge.

2.1 How Humans Make Meaning with Words and Symbols

To illustrate more clearly how humans make meaning, it is useful to distinguish, for the purpose of this discussion, two different dimensions of meaning that Hayakawa (1990) called the intensional and extensional worlds. The extensional world refers to the world of objects and events that we directly know through our firsthand experiences of seeing, hearing, feeling, and being physically present. It corresponds to the world of material objects, including our own bodies. The intensional world, on the other hand, represents the world of ideas that we know through words and symbols. This is second-hand knowledge that we learn by hearing others talk about these ideas or by reading reports (and reports of reports) from various sources. Most of the knowledge we learn in schools comes from the intensional world. As Hayakawa’s (1990) notion of language is limited to speech and written text, the intensional world for him is also the verbal world. However, we can extend Hayakawa’s intensional world to include other multimodal means of communication, such as images and sounds.

Human meaning-making involves two mutually dependent processes in these two worlds, as visually summarized in Fig. 1. The first process occurs when we associate the meaning of a word with what it stands for in the extensional world. This meaning, called the extensional meaning of the word, includes all the objects and actions we have encountered and experienced in the physical world. For instance, most people have seen a dog in real life, and we associate the word “dog” with that creature we have encountered in the past or present. It is important to note that the pronunciation and spelling of the word “dog” in English or in other languages (e.g., Spanish — perro, Greek — σκύλος, Hebrew — כֶּלֶב, Korean — 개) has no inherent connection to the creature represented by the word. Instead, the relationship between the word and the thing it represents is purely symbolic. According to Peirce (1986), meaning-making in the extensional world involves a triadic interaction among three elements: the sign, the object, and the interpretant. The sign refers to the word or symbol used to represent an object, the object refers to the thing represented by the sign, and the interpretant is the meaning created in the mind of a person interpreting the sign. Thus, this meaning-making process is inherently human. It is also how young children learn everyday words like “dog”: by hearing the pronunciation of these words while seeing the accompanying objects or experiencing certain situations through their bodies. In this sense, extensional meaning is always made within a physical context, involving surrounding material things and our bodies.

Fig. 1 How humans make meanings in the intensional and extensional worlds

Simultaneously, we also make intensional meaning of a word based on its relationship with other words. This process involves understanding the meaning of a word by using more words to provide synonyms, definitions, or a list of attributes associated with the word. It is also how the meaning of a word or phrase is defined in a dictionary. This meaning can be explicitly spoken or written in our communication, as well as mentally processed as inner speech in our thought (cf. Vygotsky, 1986). Regardless, the intensional meaning of a word is not found or contained within the word itself; rather, the person hearing or reading the word needs to first understand the lexicogrammar of the language (Halliday, 1985) and then construct its meaning based on the semantic relationships among other words (Lemke, 1990). For instance, if someone has never seen a dog, they might learn the meaning of this word by reading or hearing a description of it, and consequently associating it with a kind of mammal or pet (a classifying relationship), having four legs and a long snout (a compositional relationship), making a barking noise (an attributive relationship), or looking like a wolf or fox (a comparative relationship). In this way, any word comes to be associated with a list of other words based on their mutual semantic relationships in a network of words.
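The four kinds of semantic relationship described above can be pictured as a small network of typed links between words. The following sketch is purely illustrative (the data structure and relation labels are our own, not drawn from the cited literature); it shows how an intensional “definition” of “dog” can be assembled entirely from relations to other words, with no reference to any physical dog:

```python
# A toy semantic network: the intensional meaning of "dog" expressed
# purely as typed relationships to other words. The relation labels
# follow the four kinds of relationship discussed in the text.
semantic_network = {
    "dog": [
        ("classifying", "mammal"),        # a kind of mammal or pet
        ("classifying", "pet"),
        ("compositional", "four legs"),   # parts and features
        ("compositional", "long snout"),
        ("attributive", "barks"),         # characteristic behaviour
        ("comparative", "wolf"),          # looks like a wolf or fox
        ("comparative", "fox"),
    ],
}

def describe(word, network):
    """Build a verbal 'definition' of a word from its relations alone,
    grouped by relation type."""
    relations = network.get(word, [])
    return {rel: [target for r, target in relations if r == rel]
            for rel in {r for r, _ in relations}}

print(describe("dog", semantic_network)["classifying"])  # ['mammal', 'pet']
```

Note that nothing in this structure ever "points" outside the network; every entry is defined only by more words, which is exactly the self-contained character of the intensional world.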

Language, however, is not limited to the use of words and verbal modes of communication. According to researchers in social semiotics (e.g., Halliday, 1978; Kress & van Leeuwen, 2006; Lemke, 1998), language is defined as a symbolic system of culturally shaped signs used by a community to make meanings in a social context. These signs can include alphabets, icons, shapes, and symbols. In this sense, the systems of visual diagrams, mathematical symbolism, and computer code are also considered languages insofar as they are used in different cultural ways of meaning-making. Just as meaning-making with verbal language is based on recognizing the semantic relationships among words, meaning-making with other symbolic systems is based on recognizing the connections among their constituent symbols, for example, line, shape, space, and color for visual representation (Kress & van Leeuwen, 2006), and numerical notations and algebraic symbols for mathematics (O'Halloran, 2006).

Our ability to make intensional meaning using all the languages at our disposal is an important part of knowledge building, as it allows us to extend and generalize our knowledge beyond our immediate physical context. Because extensional meaning-making relies on things that we can directly see and point at, its reach is limited. Through language (as a symbolic system of words and signs), we can acquire knowledge of things, places, and events we have not personally seen or encountered. This is how we learn about past and distant objects like the Tasmanian tiger, Antarctica, and the Krakatau eruption by reading words and viewing images without physically being there. We can even use the same language to create imaginary and fictional worlds that do not exist, such as Tolkien’s Middle-earth or Pandora in Avatar.

Intensional meaning-making is also essential to our process of abstraction. The words we use in the intensional-verbal world often capture general features of the associated objects or experiences while leaving out many specific and detailed characteristics. The word “dog,” for instance, often does not refer to a specific dog that we see; rather, it serves as a classification (or word concept) referring to a group of creatures that share common characteristics. Although every dog in the world is unique and different from other dogs, those differences are disregarded in the process of abstraction. If we want to be more specific, we can create new words to specify sub-categories of dogs (e.g., Poodle, Toy Poodle, Toy Poodle in Australia), but these words still represent a class of many physical entities. Conversely, if we want to be more general, we use words that further generalize the range of things, encompassing not only all the dogs in the world but also their superordinate categories. For example, words like canine, mammal, and animal are concept words denoting higher levels of abstraction.

Our human ability to move up and down this “ladder of abstraction” through the use of language shapes the basis of abstract thinking (Hayakawa, 1990). However, it is crucial to note that this abstract thinking is meaningful only when there is something real (i.e., material objects) in the extensional world that supports the bottom of the abstraction ladder in the first place. Thus, while much of our knowledge about the world consists of ideas and abstractions constructed through the association of words and symbols in the intensional world, many of these words and symbols have a material basis or origin in the extensional world, as shown in Fig. 1. This is how human meaning-making is anchored to a material reality.
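The ladder of abstraction can likewise be sketched as a simple chain of superordinate categories. This toy illustration (the chain and function are hypothetical, for exposition only) shows how each step up the ladder discards the specifics of the level below, and how the whole chain rests on the concrete word at the bottom rung:

```python
# A toy "ladder of abstraction": each word points to its superordinate
# category. Climbing up discards detail; only the bottom rung refers to
# creatures we can actually see and point at in the extensional world.
LADDER_UP = {
    "Toy Poodle": "Poodle",
    "Poodle": "dog",
    "dog": "canine",
    "canine": "mammal",
    "mammal": "animal",
}

def climb(word, rungs=1):
    """Move `rungs` steps up the ladder, stopping at the top."""
    for _ in range(rungs):
        if word not in LADDER_UP:
            break  # reached the most abstract category
        word = LADDER_UP[word]
    return word

print(climb("Poodle", 2))  # prints "canine"
```

The point of the sketch is that every abstraction, no matter how high, is reachable only by starting from a word with a material referent at the bottom of the chain.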

This connection to a material reality is even more critical in science, as its epistemic authority is built on the interplay between scientists and material objects (as discussed in the previous section). If we trace all accepted scientific claims back to their production in laboratories or research centers, we find a similar ladder of abstraction. In other words, what we can claim in science are abstractions built on a very long chain of transformations from material substances to inscriptions, achieved through the manipulation of matter, tools, and apparatus (Latour, 1987; Latour & Woolgar, 1979). There is a risk if our claims are defined solely in words and debated by people in the intensional world, without actually pointing to the extensional world of material things down the abstraction ladder. In such a scenario, our meaning-making would revolve in verbal circles within the intensional world, losing a meaningful connection with our material reality. This becomes a possibility when we employ GenAI tools without a firm understanding of how they function in contrast to how scientific knowledge is produced.

2.2 How GenAI Generates Human-like Text Through Language

GenAI utilizes large language models (LLMs) to produce human-like text. These LLMs work by training a deep neural network on vast amounts of text data consisting of billions of words from books, articles, and websites. During the training process, the model learns the complex patterns and relationships between words and phrases in natural language and assigns probabilities to each possible word or sequence of words that can follow a given input sequence. This input sequence can range from a few words to an entire sentence. When generating text, the LLM uses these probabilities to predict the most likely word or sequence of words to follow the input sequence (e.g., LLM prompt: the cat sat on the…?). This prediction is made by sampling from a probability distribution generated by the model. This process is then repeated, with each new prediction informing the subsequent predictions until the desired length of the generated text is achieved. By using these techniques, LLMs can produce verbal text that is contextually appropriate, fluent, and coherent (OpenAI, 2023a).
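The prediction loop described above can be sketched with a toy next-word model. This is a drastic simplification, assuming hand-written probabilities over a handful of words rather than distributions learned from billions of words by a neural network, but the repeated sample-and-append loop mirrors the generation process in miniature:

```python
import random

# A toy "language model": for each word, a probability distribution over
# possible next words. Real LLMs learn such distributions (conditioned on
# much longer contexts) from training data; these numbers are invented.
NEXT_WORD_PROBS = {
    "the": {"cat": 0.5, "mat": 0.5},
    "cat": {"sat": 1.0},
    "sat": {"on": 1.0},
    "on":  {"the": 1.0},
    "mat": {".": 1.0},
}

def generate(prompt, max_words=10, seed=0):
    """Repeatedly sample the next word from the model's distribution,
    feeding each prediction back in, until a length limit or dead end."""
    rng = random.Random(seed)
    words = prompt.split()
    while len(words) < max_words:
        probs = NEXT_WORD_PROBS.get(words[-1])
        if probs is None:  # no known continuation: stop generating
            break
        choices, weights = zip(*probs.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the cat sat on the", max_words=8))
```

Notice that the model completes "the cat sat on the…" using only word-to-word statistics; at no point does anything in the program refer to an actual cat or mat.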

Currently, many GenAI systems are also multimodal in that they can “read” images, translate between text and images, and generate novel content that combines inputs from multiple modes. Meta has just announced a multimodal GenAI system that links together data from six different modes: images, text, audio, depth, thermal, and movement (Vincent, 2023). The way that GenAI can interpret and produce images is generally similar to how it generates verbal text using LLMs. It involves feeding the neural network a large dataset of images and training it to identify common patterns and features that are associated with particular objects, scenes, or actions. The process involves breaking down an image into its constituent parts (e.g., clusters of pixels), and analyzing the features in these clusters, such as color, shape, and spatial arrangement (OpenAI, 2023a). Thus, just like natural language, the generation of images or any other multimodal outputs from AI is based on a probabilistic prediction of the patterns and relationships among the signs used within the symbolic system.
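The patch-based analysis described above can be illustrated with a toy example (the image, patch size, and mean-brightness "feature" are our own simplifications; real vision models learn far richer features over much larger inputs). The point is that, to the model, an image is only an array of numbers:

```python
# A 4x4 grayscale "image" as a grid of brightness values (0 = black,
# 255 = white). To a model, this array of numbers is all an image is.
image = [
    [  0,   0, 255, 255],
    [  0,   0, 255, 255],
    [255, 255,   0,   0],
    [255, 255,   0,   0],
]

def patch_features(img, size=2):
    """Split the image into size x size patches (clusters of pixels) and
    compute one crude feature per patch: its mean brightness."""
    features = []
    for r in range(0, len(img), size):
        row = []
        for c in range(0, len(img[0]), size):
            patch = [img[r + i][c + j]
                     for i in range(size) for j in range(size)]
            row.append(sum(patch) / len(patch))
        features.append(row)
    return features

print(patch_features(image))  # [[0.0, 255.0], [255.0, 0.0]]
```

The feature grid captures a checkerboard pattern in the pixel values, but nothing in the computation carries any referential meaning about what the image depicts in the physical world.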

GenAI models generate multimodal text based on the patterns and relationships within the symbolic systems (e.g., verbal language, images, computer code) they are trained on. This means that their understanding of meaning is limited to the data used during training. But even if these models had access to an infinite amount of data, those data would still exist within a self-contained intensional world of languages. Unlike humans, GenAI models cannot establish a connection to the extensional world, as they lack access to the range of sensory inputs and embodied experiences that humans possess. In other words, while they can predict what words go with other words to form a coherent expression, or what pixels connect with other pixels to form an aesthetically pleasing image, they do not have the referential meaning of what those words and pixels stand for in the extensional world. This is a fundamental difference between how humans make meanings with language and how GenAI models generate human-like text through language. Table 1 summarizes the key differences between these two processes.

Table 1 Differences between human meaning-making and AI text generation

3 Discussion and Recommendations

With an understanding of how materiality matters in the distinction between human meaning-making and AI text generation, how should researchers and educators rethink the way science is taught and learned in a future that is likely to be dominated by the use of GenAI? This section highlights three particular areas that require further consideration within the science education community.

3.1 Practical Work for Learning Scientific Epistemic Authority

Firstly, the role of practical work needs to be strengthened as an indispensable setting for students to understand the epistemic authority in science and its connection to a material reality. Practical work, involving the manipulation and observation of material objects in a laboratory or fieldwork, has long been a staple component of formal science curricula in many countries (Abrahams et al., 2013). As GenAI increasingly facilitates science learning in the future, it is imperative not to overlook or diminish the importance of practical work in the curriculum. While GenAI can support certain aspects of practical work, such as generating ideas or assisting in report writing, it cannot replace the sensory and embodied experiences of manipulating material objects and engaging with scientific apparatus. Therefore, we need to reevaluate the role of practical work and discuss its value in science education within a future learning environment dominated by GenAI.

Currently, the emphasis on practical work often revolves narrowly around teaching and assessing investigation skills and laboratory report writing (Hetherington et al., 2018). There is a tendency to separate the learning of theoretical content in the classroom from the development of process skills in the laboratory, even though practical work is deeply intertwined and central to theory-building in science (Ferreira & Morais, 2014). As a result, practical work is often seen as a repetition of “recipe-style” experiments aimed at confirming or verifying previously learned theoretical concepts in the classroom (Abrahams & Reiss, 2012). Moreover, educators often view practical work and demonstration mainly as a means to motivate students and promote learner engagement (Holstermann et al., 2010). As the use of GenAI becomes more prevalent, there is a need to reposition the role of practical work as a means for students to explore and understand how epistemic authority in science is determined by our connection to a material reality. As discussed in this article, our entanglement with materiality forms the core foundation of our epistemic knowledge in science, and this is something that cannot be replaced by GenAI.

The current emphasis on the practice turn in science education research (Erduran, 2015) presents an excellent opportunity to link practical work with the foreseeable use of GenAI as an integral part of scientific practices. It is evident that practical work and the role of materiality are central to specific scientific practices like conducting investigations and arguing from evidence (Walker & Sampson, 2013), while GenAI tools can be employed to facilitate modeling, data analysis, and communication (Bianchini et al., 2022). It is vital not to separate these two processes into isolated and independent activities by creating an artificial divide between empirical investigation (with material objects in the extensional world) and theory building (with GenAI in the intensional world). Instead, scientific practices need to be conceptualized and taught as an interconnected system of activities (Ford, 2015). We must establish a stronger connection between the manipulation and transformation of material objects in practical work and the “translation of inscriptions” (Latour & Woolgar, 1979) that include numerical data, graphs, equations, and written abstractions. Each translation involves a multimodal transformation of data from one mode of representation to another (e.g., material to numerical to graphical). By understanding how scientific evidence is constructed through this chain of multimodal transformations, students can better connect theoretical claims in science to their material origins.

It is likely that GenAI will be used to support and supplement practical work in the future, rather than replace it. As previously mentioned, multimodal GenAI models can be employed to create complex models and visualizations in data analysis, as well as to guide students in designing experiments and writing laboratory reports. Some educators have also suggested using GenAI to create virtual laboratories, allowing students to conduct experiments without the need for physical resources and equipment (e.g., Dilmegani, 2023). While virtual laboratories offer interactivity and advantages in terms of safety and accessibility, it is essential to recognize that they are simulated representations of real-world phenomena. Therefore, when GenAI is involved in practical work, it is crucial to differentiate between first-hand data generated from material manipulation and second-hand data obtained from digital and multimedia sources. These two types of data are not independent but are connected through a chain of translation over time (Latour & Woolgar, 1979). By extending and tracing this translation process, we find that all second-hand data (e.g., words, numbers, images) have a material origin at a specific time and location. Hence, if students are using GenAI in practical work, it is important to engage them in discussions about the sources of second-hand data. These discussions should focus on tracing the material origin of any data source by asking questions such as: How were these data produced? Who created them, and what actions and tools were involved?

3.2 Evidence from Material Forms of Inquiry and Argumentation

Even before ChatGPT and other GenAI tools were launched in the public domain, it was already claimed that we are living in a post-truth world rife with misinformation and fake news (Billingsley & Heyes, 2023). Public trust in science is declining as some of our basic scientific claims are contested by conspiracy theorists, anti-vaxxers, COVID sceptics, climate change deniers, and flat Earthers (Osborne & Pimentel, 2023). The advent of multimodal GenAI is likely to worsen the situation in two ways. First, GenAI can be used by science deniers to maliciously create and spread authentic-looking but fake content, not only as written text but also in the form of deepfake images and videos. Second, even without malicious intention, the content from GenAI can be inaccurate and biased due to the training data fed into the LLMs, the AI’s lack of genuine and contextual understanding of the data, and any inaccuracy or bias contained in the user’s input itself. OpenAI (2023b), the developers of ChatGPT, openly warn that:

“ChatGPT is not free from biases and stereotypes, so users and educators should carefully review its content. It is important to critically assess any content that could teach or reinforce biases or stereotypes. Bias mitigation is an ongoing area of research for us, and we welcome feedback on how to improve. The model is skewed towards Western views and performs best in English. Some steps to prevent harmful content have only been tested in English” (para. 1).

In this “AI-driven post-truth” future, there is now an urgent need to stress the role of materiality as the key foundation of scientific knowledge and practice. Research and practice in science education need to take a material turn and forge a stronger linkage between our interaction with material objects and our production of scientific claims and evidence. More emphasis on evidence-based argumentation in the science curriculum and instruction is also required. Although there is currently a strong research focus on scientific argumentation, the research in this area is largely framed along an epistemic and/or dialogic perspective (Erduran et al., 2015). Both of these perspectives attend only to the social aspect of scientific practice (Ford & Forman, 2006) and sideline its material aspect (Tang, 2022). Beyond the use of rhetoric and debate to persuade, research in scientific argumentation must now take into account the role of materials as an integral part of forming one’s argument. If the goal of a scientific argument is to convince someone, nothing beats the first-hand experience of seeing or touching something in the extensional world. For instance, anyone who has personally seen the rings of Saturn or the cloud layers of Jupiter through a telescope will develop a stronger emotional connection and epistemological commitment to the existence of those planets than someone who has only looked at images and videos of them. Sadly, however, many children and adults today do not have the first-hand experience of viewing a celestial object through a telescope. Science education in the future needs to provide more opportunities for students to gather evidence in the physical world and to use that evidence to support scientific argumentation.

Where possible, students should be provided with more opportunities to engage in argumentation based on first-hand data generated from direct observations or actual experiments, instead of second-hand data generated by AI or other multimedia sources. Take the Flat Earth movement as an example: it is astounding that many otherwise sensible adults and children today still doubt that the Earth is round. With the proliferation of deepfakes powered by AI, photographs and videos of a round Earth and the moon landing will no longer be convincing evidence to future generations of students. One potential lesson idea to convince students of a spherical Earth is to replicate the method used by the ancient Greeks over 2000 years ago. Using synchronous video conferencing, two groups of students in separate cities of roughly the same longitude (e.g., Miami and Pittsburgh, or Oslo and Rome) can simultaneously measure the shadow cast by a meter stick at noon. They can then use the data to support or refute various claims about the Earth’s surface (see Tang, 2021). Another possible approach is to encourage students to contribute first-hand data from their localities to global citizen science projects, such as GLOBE (e.g., Kohl et al., 2021; Manzanarez et al., 2022), and to examine how the data are used to develop different findings and claims about the environment.
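The geometry behind this classroom activity can be sketched numerically. The following is a minimal illustration in Python, assuming the two sites lie on roughly the same meridian and that their north-south separation is known; the shadow lengths and the 1,600 km separation below are hypothetical figures for illustration, not real measurements:

```python
import math

def sun_angle_deg(stick_height_m: float, shadow_length_m: float) -> float:
    """Angle of the sun from the vertical, from a stick and its noon shadow."""
    return math.degrees(math.atan(shadow_length_m / stick_height_m))

def earth_circumference_km(angle_a_deg: float, angle_b_deg: float,
                           distance_km: float) -> float:
    """Eratosthenes' method: the difference in noon shadow angles between
    two sites on the same meridian equals the arc of latitude between them,
    so the full 360-degree circumference can be scaled up from it."""
    arc_deg = abs(angle_a_deg - angle_b_deg)
    return 360.0 / arc_deg * distance_km

# Hypothetical readings: two sites about 1,600 km apart on roughly the
# same longitude, each measuring the noon shadow of a 1 m stick.
angle_south = sun_angle_deg(1.0, 0.35)  # southern site: shorter shadow
angle_north = sun_angle_deg(1.0, 0.65)  # northern site: longer shadow
estimate = earth_circumference_km(angle_south, angle_north, 1600)
print(f"Estimated circumference: {estimate:.0f} km")
```

On a spherical Earth the angle difference grows with the separation between sites, whereas a flat Earth lit by a distant sun would produce identical shadows everywhere; students can also compare their estimate against the accepted circumference of roughly 40,000 km.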

While it is important to maintain skepticism regarding the evidence produced by GenAI, we should also consider ways to integrate GenAI into the teaching of scientific inquiry and argumentation. For example, GenAI output can serve as a source of evidence for students to compare with other data collected during a science investigation. This comparison can provide opportunities for students to contemplate the value of AI-generated evidence in relation to material forms of inquiry, such as evidence obtained from experiments, laboratories, or fieldwork. Such comparisons encourage students to evaluate the validity and reliability of the different pieces of evidence gathered during their investigations. Teachers can also foster students’ critical thinking as they explore the limitations of GenAI, including the lack of transparency associated with current LLMs like ChatGPT. They should further encourage students to critically question the quality of output from GenAI tools (Buchanan, 2023). Embracing a critical stance towards the evidence generated by GenAI creates valuable opportunities for students to assess GenAI outputs alongside material forms of inquiry.

3.3 Learning, Collaboration, and Shared Agency with GenAI

The post-humanist perspectives from new materialism and science studies highlight the significance of material objects, such as scientific apparatus, tools, instruments, and samples in the production and validation of scientific knowledge (Barad, 2003; Latour, 1987; Pickering, 1995). In light of these perspectives, we need to approach the outputs generated by GenAI with caution, as they lack contextualization to a material reality. However, we acknowledge that GenAI itself is also a material object that can contribute to the production of scientific knowledge. According to Latour’s (1987) actor-network ontology, GenAI is a non-human actor in the assemblage of allies that transform material substances into scientific evidence and claims. Consequently, GenAI will undoubtedly reshape the way we do science. In fact, it has been reported that “some scientists were already using chatbots as research assistants – to help organize their thinking, generate feedback on their work, assist with writing code, and summarize research literature” (Nature, 2023, p. 612).

A useful insight from new materialism and science studies lies in recognizing the shared agency between humans and non-humans. Hence, we cannot attribute the creation of scientific knowledge solely to humans or GenAI alone, but rather to a collaborative effort between the two. Just as the invention of the thermometer and air pump paved the way for scientific understanding of temperature and air pressure (Milne, 2019), the utilization of GenAI by scientists is likely to generate novel knowledge that would otherwise be unattainable without GenAI. Future human-AI collaborations will need to harness the strengths of both human scientists and non-human AI tools. GenAI, with its immense processing power, excels in identifying patterns within the intensional world of languages. On the other hand, humans, with their embodied experiences grounded in a material reality, play a vital role in making scientific knowledge meaningful and relevant to the extensional world that we live in.

Drawing inspiration from this collaboration between scientists and GenAI, students can also leverage GenAI as a valuable learning tool in science education. Considering the possibilities of using GenAI in the learning of science, a wide range of applications is worth exploring in future research (Cooper, 2023). Students should be encouraged to see GenAI as a tool to augment learning rather than as a definitive source. At different levels of schooling, diverse factors will support the overall objective of cultivating GenAI literacy in science-learning contexts. For instance, in primary and middle schools (ages ~ 7–13), students may use GenAI to make sense of unfamiliar scientific terms or definitions by having it simplify ideas into more accessible language, serving as a scaffold to support their learning (e.g., ChatGPT prompt: In simple language, explain balanced and unbalanced forces). Students may also practice answering extension questions posed by the GenAI or engage in multiple-choice quizzes with answer keys to verify their responses (e.g., ChatGPT prompt: create a multiple choice quiz about forces at a year 4 level with answer key). At this level, an important focus will be ensuring that students understand the technology’s capabilities and limitations, with close adult supervision and support. For high school students (ages ~ 14–18), GenAI is likely to be helpful in promoting understanding of more complex science-related matters, with a focus on critical thinking (e.g., ChatGPT prompt: How accurate are climate models in predicting global warming, considering some skepticism about their reliability?), on how LLMs are trained, and on the importance of verifying information against other sources of evidence. In higher education, at the undergraduate and graduate levels, GenAI can serve as a research assistant and support technical aspects of different science fields (e.g., ChatGPT prompt: Summarise the research paper I have uploaded, what are the key research findings?).
While the use of GenAI will be shaped by the specific field, there is scope for further exploration of GenAI ethics, data privacy, and the use of GenAI to analyze data. These examples illustrate the potential for AI to facilitate personalized learning, offering diverse forms of support to students. They also underscore the importance of equipping students (and educators) with effective LLM prompts that can enhance science learning. In a broader sense, science educators need to provide careful guidance on the use of AI platforms, encouraging open discussions about their appropriate integration in science classrooms, the potential benefits they offer, helpful AI prompts, and critical examination of their limitations when engaging in scientific inquiry.

4 Conclusion

Materiality plays a central role in shaping what is “real” about our collective knowledge and practices of science. As we move towards a future with an increasing prevalence of GenAI, it is essential that science education acknowledges our connection to the physical world and upholds the epistemic authority of scientific knowledge. By actively recognizing and incorporating the role of materiality, we can safeguard our understanding of science, ensuring it remains firmly grounded in material reality and the intricate interplay between humans and the physical world.